An Analytical Study of Synonymy in Assamese Language Using Wordnet: Classification and Structure

Shikhar Kr. Sarma, Himadri Bharali, Mayashree Mahanta, Utpal Saikia, Dibyajyoti Sarmah
Gauhati University, Guwahati, Assam, India

Abstract

This paper categorizes the different types of synonymous words found in the Assamese Wordnet and highlights their synonymic patterns and grammatical categories. Synonymy is an important component of the vocabulary of a language: it establishes a lexical relation between words. The term ‘synonymy’ is applied to two or more words that share the same semantic features. Wordnet is a lexical database consisting of synsets. A synset is constructed by assembling a set of synonyms that together define a unique sense, and the synset is the basic building block of Wordnet. The Assamese language is rich in synonyms. In the Assamese Wordnet, more than 20,000 synsets have been entered under the categories Noun, Verb, Adverb and Adjective. These synsets can be of different types according to their semantic similarity, connotation, denotation, stylistic variation, etc.

1 Introduction

Synonymy is an important feature of the vocabulary of any language, but it is very difficult to give a clear, precise and correct definition of it. There are various approaches, with numerous definitions of synonym and types of synonyms. Linguistically, two or more words in the same language with very closely related meanings are called synonyms. It should be mentioned here that synonymy does not mean ‘sameness of meaning’, as no two terms have completely identical meanings. It is generally accepted that complete synonymy is rare in natural language. The discussion of synonyms comes under the study of lexical relations, which analyses the meanings of words in a language that have related meanings. The idea of synonymy applies not only to lexical items but also to idioms and larger expressions. A lexicographer builds a synonym dictionary from the words that share the same semantic features in a given language.

The present paper deals with lexical synonyms of the same word class, not with phrasal synonyms. We categorize the synonymous words by the semantic features they share, based on the Assamese Wordnet. We also point out the synonymic pattern and the grammatical categories of synsets in the Wordnet.

2 A rapid sketch of Assamese Language

Assamese is the easternmost New Indo-Aryan language, spoken in the Brahmaputra valley, which at present comprises six districts with Lakhimpur in the extreme east and Goalpara in the west. Of the languages spoken around it, the Tibeto-Burman languages and Khasi are the important ones. According to the 1991 census report, the language has about 13 million speakers. It is also spoken as a second language by a considerable number of speakers of Tibeto-Burman languages such as Bodo, Mising and Karbi. Traditionally, it has served as a lingua franca or pidgin in the neighbouring states of Nagaland and Arunachal Pradesh.

The word ‘Assamese’ is an English one based on the anglicized form ‘Assam’, from the native word ‘Asam’. The word Assam was connected with the Shan invaders of the Brahmaputra valley during the 13th century. In modern Assamese, the Shan invaders of the 13th century are termed ‘Ahoms’.

The presence of the Assamese language dates back to the literature of the Charyapadas, written by Buddhist scholars. The Assamese found in the Charyapadas reflects the initial stages of its evolution. Literature in a distinctly Assamese language is found from the Kavyas of the pre-Sankari era, in the 13th century AD. From that time onwards, pure Assamese with its structured forms evolved (Goswami, 1983). The Assamese script is derived from the Brahmi script, and it played a vital role in the evolution of the Indian scripts. Rock inscriptions and copper plates from the 5th to the 9th century show the evolution of the Assamese script. There are eight vowel phonemes in Assamese, and twenty-one consonant and two semi-vowel phonemes in Standard Colloquial Assamese (Kakati, 2008).

2 A Brief Discussion of Assamese Vocabulary

The scope of Assamese vocabulary is very vast. It consists of words of Sanskrit origin, non-Aryan words and dialect-oriented words. Assamese socio-cultural influences are also perceived in the vocabulary. It should be noted that Assamese still lacks a common vocabulary dictionary. Moreover, no dictionary is known from the early and middle ages. The selected modern dictionaries are ‘A Dictionary in Assamese and English’ by Miles Bronson (1867); ‘Hemkosh’ (1900) by Hemchandra Barua, later compiled by Debananda Barua (14th edition), which includes 1,54,428 words; ‘Chandrakanta Abhidhan’ (2004, 3rd edition); ‘Adhunik Asomiya Sabdakosh’ (2007, 9th edition); and ‘Asamiya Jatiya Abhidhan’ (2010). Many other vocabulary dictionaries are available in Assamese, but no common standard vocabulary dictionary has been made to date. Many critics have prepared vocabulary lists in their own way. Earlier philologists such as Kaliram Medhi and Banikanta Kakati classified the vocabulary in their own style. Kaliram Medhi, in ‘Asomiya Byakaran aru Bhasatattva’, classified Assamese vocabulary into ‘tatsama’, ‘tadbhava’ and ‘desaja’. But his ‘desaja’ words are shown as loan words, most of which are Perso-Arabic (Pathak, 2004). Therefore, his vocabulary classification cannot be taken as valid. On the other hand, though Banikanta Kakati’s classification of Assamese vocabulary covers almost all aspects, it cannot be regarded as a valid one either.

It is interesting to note that there is a large number of loan words in Assamese. In day-to-day life these loan words are used extensively to express feelings, ideas, etc. Perso-Arabic words in particular are used in Assamese and occupy a significant place in the language.

Assamese vocabulary can be divided under the following heads (Sarma et al., 2012):

1. Aryan or words of Sanskrit origin
   a. Tatsama
   b. Semi-tatsama
   c. Tadbhava
2. Non-Aryan words
   a. Austro-Asiatic
   b. Tibeto-Burmese
   c. Tai-Ahom
   d. Dravidian
3. Loan words
   a. Words coming from N.I.A. languages
   b. Foreign words
      i. Persian
      ii. Arabic
      iii. Portuguese
      iv. English
   c. Loan translations
      i. Translated words
      ii. Terminology
4. Unclassified words
   a. Hybrid
   b. Onomatopoetic
   c. Compound

3 Wordnet and Synonym Sets Building in Assamese Language

Wordnet is a repository of the words of a language. It is basically a lexical database of synonyms, in which words are classed together according to their similarity of meaning. Vocabulary plays a main role in building a Wordnet. The building of the Assamese Wordnet has progressed almost to the point of providing all the lexical words. Though the Assamese Wordnet tries to cover all Assamese word forms, there are still many words in the language that need to be entered (Sarma et al., 2010).

3.1 Classification of Synonymy in Assamese

Assamese is rich in synonyms. We can classify synonyms under the following three heads:

1. Absolute synonymy:
Words can be called absolutely synonymous if they share all their semantic features in all contexts of occurrence. It is generally recognized that absolute synonyms are almost non-existent. Though very rare, absolute synonymy certainly exists in Assamese. It is limited mostly to dialectal variations and technical or institutional terms. For example:
বিদ্যালয় ‘bidyaaloi’ (school): পঢ়াশালী, পাঠশালা ‘parhaashaalii, paathshaalaa’
খবৰ ‘khabar’ (news): বাতৰি, সংবাদ, সংবাদ পত্ৰ, বাতৰি কাকত ‘baatari, sangbaad, sangbaad patra, baatari kaakat’

2. Stylistic synonymy:
Stylistic synonyms are words conveying the same concept but differing in stylistic connotations. Stylistic synonymy is very common in Assamese. For example:
মৃত্যু ‘mrityu’ (death): মৰণ, প্ৰয়াণ, প্ৰাণত্যাগ, মহাপ্ৰয়াণ, বৈকুণ্ঠ প্ৰয়াণ, তিৰোধান, তিৰোভাৱ, কাল, লোকান্তৰ, কালগ্ৰাম, দেহাৱসান ‘maran, prayaan, praantyaag, mahaaprayaan, boikuntha prayaan, tirodhaan, tirobhaabh, kaal, lokaantar, kaalagraam, dehaawasaan’
সুন্দৰ ‘sundar’ (beautiful): ধুনীয়া, দেখনিয়াৰ, ৰূপহ, মোহনীয়, নয়নাভিৰাম, চকুত লগা, চকু জুৰোৱা, নয়ন জুৰোৱা, বিতোপন, চকুত চমক লগোৱা ‘dhuniaa, dekhaniyaar, rupah, mohaniiya, nayanaabhiraam, sakut lagaa, saku juruwaa, nayan juruwaa, bitopan, sakut samak lagowaa’

3. Ideographic synonymy:
Ideographic synonyms convey the same concept but differ in denotation. This is also called denotation-based synonymy. For example:
টুকুৰা ‘tukuraa’ (a piece): চকল, ডোখৰ ‘chakal, dokhar’
খং ‘khong’ (anger): ক্ৰোধ, ৰাগ, কোপ, ক্ৰোধাগ্নি ‘krodh, raag, kop, krodhaagni’

Apart from these, further synonym types can be distinguished in Assamese depending on resemblance of meaning, distribution, style, form, etc.

3.2 Near Synonymy

Near synonyms are words whose meanings are relatively close or more or less similar, but which are not fully intersubstitutable. They vary in their shades of denotation, connotation, implicature, emphasis or register. Near synonyms are found extensively in Assamese. For example:
ভাল ‘bhaal’ (good): সজ্জন, সৎ ‘sajjan, sat’
All these words denote the quality of goodness, but they differ from one another in their denotational meaning. The word ভাল ‘bhaal’ is a generic term, whereas সজ্জন ‘sajjan’ is more particular and applies only to human beings. সৎ ‘sat’ applies to both animate and inanimate things. The usage of these words is shown below (an asterisk marks an unacceptable combination):
ভাল ‘bhaal’: ভাল ব্যক্তি / কাম / কিতাপ ‘bhaal byakti/kaam/kitaap’ (good person/work/book)
সজ্জন ‘sajjan’: সজ্জন ব্যক্তি / *কাম / *কিতাপ
সৎ ‘sat’: সৎ ব্যক্তি / কাম / *কিতাপ

Near-synonyms can vary as follows:

Type of variation: Examples
Collocational: কৰ্ম, চাকৰি ‘karma, chaakari’ (work)
Stylistic, formality: সন্মানীয়, মাননীয়, মান্যবৰ ‘sanmaniiya, maananiiya, maanyabar’ (honourable)
Stylistic, force: ধ্বংস, পতন ‘dhansha, patan’ (destruction)
Expressed attitude: ক্ষীণ, লাহী, শুকান ‘khin, laahii, sukaan’ (thin)
Emotive: মা, আই, মাতৃ ‘maa, aai, matri’ (mother)
Continuousness: নিগৰা, বোৱা ‘nigaraa, bowaa’ (to drip, to flow)
Fuzzy boundary: বননি, বন, হাবি, জংঘল, অৰণ্য ‘banani, ban, haabi, janghal, aranya’ (wood)

Table 1: Type of variation

The first column of Table 1 gives the classes of near-synonym variation, and the second column gives examples of each class. The near-synonyms listed above appear almost identical in meaning, but most of them differ in their distribution. The distribution of the first type of variation in Table 1 is shown below:
Collocational: কৰ্ম স্থান ‘karma sthaan’ (work place)
*চাকৰি স্থান ‘chaakari sthaan’ (work place)

3.3 Connotation Based Synonymy

A more modern approach classifies synonyms according to how synonymous words differ in connotation. The scope of connotation-based synonymy is very vast. Connotation-based synonyms in Assamese are categorized into the following types:

a. Connotation of degree or intensity: আচৰিত, অবাক, স্তম্ভিত ‘aacharit, abaak, stambhita’ (surprise)
b. Connotation of duration: জুপি চোৱা, জুমি চোৱা, ভূমুকিয়াই চোৱা ‘jupi chowaa, jumi chowaa, bhuumukiyaai chowaa’ (to peep)
c. Emotive connotation: অকলশৰীয়া, শূন্যতা ‘akalshariiyaa, shunyataa’ (loneliness)
d. Evaluative connotation: conveys the speaker’s attitude as good or bad. For example: প্ৰখ্যাত, জনাজাত, বিখ্যাত ‘prakhyaat, janaajaat, bikhyaat’ (famous)
e. Causative connotation: পকা, পৰিপক্ক হোৱা ‘pakaa, paripakka howaa’ (to ripen)
f. Connotation of manner: সোনকালে, ততালিকে, খৰকৈ, শীঘ্ৰে ‘sonkaale, tataalike, kharkoi, shiighre’ (fast)

3.4 Cognitive Synonymy

Cognitive synonymy is also known as descriptive, propositional or referential synonymy. It is sometimes described as incomplete synonymy (Lyons, 1981), or as non-absolute or partial synonymy (Lyons, 1996). Cognitive synonymy highlights the fact that not all speakers of a language will necessarily use a given word, yet they may understand it well. It is also termed denotative synonymy (Stanojević, 2009). It analyzes sense and denotation. Examples of cognitive synonyms:
গোপন ‘gopan’ (secret): অপ্ৰকাশ্য, গুপ্ত, গুপুত ‘aprakaashya, gupta, guput’
দেউতা ‘deutaa’ (father): পিতা, পাপা, পিতাই, আতা, বাবা, পিতৃ ‘pitaa, paapaa, pitai, aataa, baabaa, pitri’

3.5 Euphemism Synonymy

Euphemism is the substitution of words of mild or vague connotation for rough or unpleasant expressions. Synonyms of this kind are important linguistic tools inherent in our language, and most people like to use them in day-to-day conversation. The use of such words is both social and emotional. Euphemism deals with touchy or taboo subjects (such as sex, personal appearance or religion) without hurting or upsetting others (Radulović, 2012). Euphemisms can be of two types: (a) positive euphemisms, which increase acceptability in areas such as the domestic, institutional and economic, and (b) negative euphemisms, which decrease the negative values associated with negative phenomena such as war, drunkenness, crime and poverty (Rawson, 1981). For example:
Positive euphemism: স্তন ‘stan’ (breast): পিয়াহ, পয়োধৰ, পয়োভাৰ, কুচকুম্ভ, ওহাৰ, বাত
Negative euphemism: বেশ্যা ‘beshyaa’ (prostitute): গণিকা, ৰেণ্ডী, দেহোপজীৱী, পতিতা, নটিনী ‘ganikaa, rendii, dehopajiibii, patitaa, natinii’

4 Synonymic Pattern and Grammatical Categories in Assamese Wordnet

There is no fixed pattern for the number of synonymous words in a synset in the Assamese Wordnet. Sometimes only one word is provided for a concept; for certain concepts, a synset covers up to 38 synonymous words. Two examples follow, each with its Assamese gloss (AG) and English gloss (EG):

Concept:
AG: ঘেঁহু আদিৰ বিশেষ প্ৰকাৰৰ খহটা চূৰ্ণ
EG: milled product of durum wheat (or other hard wheat) used in pasta
Synset: চুজি ‘suji’

Concept:
AG: ধৰ্ম গ্ৰন্থৰ দ্বাৰা স্বীকৃত এক সৰ্বোচ্চ সত্তা, যি সৃষ্টিৰ গৰাকী
EG: the supernatural being conceived as the perfect and omnipotent and omniscient originator and ruler of the universe; the object of worship in monotheistic religions
Synset: ভগৱান, ঈশ্বৰ, প্ৰভু, বিধাতা, পৰমপিতা, দয়াময়, সৃষ্টিকৰ্তা, পৰমব্ৰহ্ম, জগদীশ, অন্তৰ্যামী, ভুৱনেশ্বৰ, কৰুণাময়, নেদেখাজনা, ওপৰৰজনা, জগজীৱ, মংগলময়, সৰ্বমংগলময়, সনাতন, বিভু, ধাতা, বিধাতাপুৰুষ, জগদীশ্বৰ, জগৎপিতা, জগৎপতি, জগতকৰ্তা, জগতস্ৰষ্টা, জগজীউ, পৰমেশ্বৰ, পৰাৎপৰ, ইচ্ছাময়, পৰমাত্মা, ত্ৰিজগতপতি, ত্ৰিলোকপতি, পৰমানন্দ, নিয়ন্তা, চিন্তামণি, ভবেশ, নিৰাকাৰ ‘bhagawaan, iishwar, prabhu, bidhaataa, parampitaa, dayaamoy, sristikartaa, parambrahma, jagadiish, antarjaamii, bhuwaneswar, karunaamoy, nedekhaajanaa, opararjanaa, jagajiiwa, mangalmoy, sarbamangalmoy, sanaatan, bibhu, dhaataa, bidhaataapurush, jagadiishwar, jagatpitaa, jagatpati, jagatkartaa, jagatsrastaa, jagajiiu, parameshwar, paraatpar, issaamoy, paramaatmaa, trijagatpati, trilokpati, paramaananda, niyantaa, chintaamoni, bhabesh, niraakaar’

The Assamese Wordnet considers only the Noun, Verb, Adverb and Adjective classes, although there is evidence of synonymous words in closed classes as well, such as prepositions, conjunctions and interjections. The reason may be that the open classes supply a large number of synonymous words and can easily be compared with other languages.

Examples of synsets by grammatical category are given below:
Noun: কাগজ, কাকত, তুলাপাত, পেপাৰ ‘kaagaj, kaakat, tulaapaat, pepaar’ (paper)
Verb: নচা, নৃত্য কৰা ‘nachaa, nritya karaa’ (to dance)
Adverb: ওচৰতে, কাষতে, সমীপতে, গুৰিতে, অদূৰতে ‘osarate, kaashate, samiipate, gurite, aduurate’ (near)
Adjective: অধম, প্ৰবলতাহীন, কম, অপ্ৰবল, বেয়া, নিকৃষ্ট ‘adham, prabalataahiin, kam, aprabal, beyaa, nikrista’ (bad)

5 Conclusion

Synonymy plays a vital role in lexical study and paves the way for wordnet building in any natural language. Synonyms in the Assamese Wordnet cover a large number of lexical words across the grammatical categories noun, verb, adverb and adjective. Accordingly, we have classified Assamese synonyms into a number of types.

It should be mentioned that dialectal forms are not considered when building synonym sets in the Assamese Wordnet. Borrowed words are included in synset building, but their number is very limited, even though many foreign words are used as native words in day-to-day communication. This issue will be taken up in later work. Nevertheless, the Wordnet with all its synsets has succeeded in representing the Assamese language in a systematic and novel way.

References

Hemchandra Barua. 1900. Hemkosh, ed. and published by Debananda Barua.
Miles Bronson. 1867. Dictionary in Assamese and English.
Sumanta Chaliha. 2007. Adhunik Asomiya Sabdakosh, Bani Mandir, Guwahati.
Golock C. Goswami. 1983. Structure of Assamese, Gauhati University, Guwahati, Assam.
Banikanta Kakati. 2008. Assamese: Its Formation and Development, Lawyers Book Stall, Guwahati, Assam.
John Lyons. 1977. Semantics. Vol. 1. Cambridge: Cambridge University Press.
John Lyons. 1981. Language and Linguistics: An Introduction, CUP, Cambridge.
John Lyons. 1996. Linguistic Semantics, CUP, Cambridge.
Maheswar Neog and Upendranath Goswami, ed. 2004. Chandrakanta Abhidhan, Publication Deptt., Gauhati University.
Ramesh Pathak. 2004. Studies in Assamese Vocabulary, Anita Pathak, Guwahati, Assam.
Milica Radulović. 2012. Expressing Values in Positive and Negative Euphemism. Facta Universitatis, Series: Linguistics and Literature, Vol. 10, No 1, pp. 19–28.
Hugh Rawson. 1981. A Dictionary of Euphemisms and Other Doubletalk. New York: Crown Publishers, Inc.
Debabrat Sarma. 2010. Asamiya Jatiya Abhidhan, Asom Jatiya Prakash.
Shikhar Kr. Sarma, Utpal Saikia, Mayashree Mahanta and Himadri Bharali. 2012. Assamese Vocabulary and Assamese Wordnet Building: An Analysis. Global Wordnet Conference, Matsue, Japan.
Shikhar Kr. Sarma, Moromi Gogoi, Rakesh Medhi and Utpal Saikia. 2010. Foundation and Structure of Developing an Assamese Wordnet, Department of Computer Science, Gauhati University. Proceedings of the 5th Global Wordnet Conference, Narosa Publishing House.
Maja Stanojević. 2009. Cognitive Synonymy: A General Overview. Facta Universitatis, Series: Linguistics and Literature, Vol. 7, No 2, pp. 193–200.
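As an illustration of the synset entries discussed in Section 4, the following is a minimal Python sketch of how a synset (a grammatical category, an English gloss and a set of synonymous words) might be represented and queried. The class name `Synset`, the field names and the helper functions are hypothetical and chosen only for this example; they do not describe the actual storage format of the Assamese Wordnet. The entries are the transliterated examples given in Section 4, with glosses shortened.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Synset:
    """One concept: a grammatical category, a gloss and its member words.

    Hypothetical structure for illustration only.
    """
    pos: str                  # "noun", "verb", "adverb" or "adjective"
    gloss: str                # English gloss (EG), shortened here
    members: Tuple[str, ...]  # transliterated synonyms, in listed order


# Example entries copied from the transliterations in Section 4.
SYNSETS = [
    Synset("noun", "milled product of durum wheat used in pasta", ("suji",)),
    Synset("noun", "paper", ("kaagaj", "kaakat", "tulaapaat", "pepaar")),
    Synset("verb", "to dance", ("nachaa", "nritya karaa")),
    Synset("adverb", "near",
           ("osarate", "kaashate", "samiipate", "gurite", "aduurate")),
    Synset("adjective", "bad",
           ("adham", "prabalataahiin", "kam", "aprabal", "beyaa", "nikrista")),
]


def synonyms_of(word, synsets):
    """Return every other member of each synset that contains `word`."""
    found = set()
    for synset in synsets:
        if word in synset.members:
            found.update(m for m in synset.members if m != word)
    return found


def size_range(synsets):
    """Return the smallest and largest synset size as a (min, max) pair."""
    sizes = [len(s.members) for s in synsets]
    return min(sizes), max(sizes)


def count_by_pos(synsets):
    """Count how many synsets fall under each grammatical category."""
    return Counter(s.pos for s in synsets)
```

With these five entries, `synonyms_of("kaakat", SYNSETS)` yields the other three words for ‘paper’, and `size_range(SYNSETS)` reflects the observation that a synset may hold a single word or many. Keeping the members as an ordered tuple preserves the order in which the words are listed in an entry.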