Cross-Lingual Lexical Sememe Prediction


Cross-lingual Lexical Sememe Prediction

Fanchao Qi1*, Yankai Lin1*, Maosong Sun1,2†, Hao Zhu1, Ruobing Xie3, Zhiyuan Liu1
1Department of Computer Science and Technology, Tsinghua University; Institute for Artificial Intelligence, Tsinghua University; State Key Lab on Intelligent Technology and Systems, Tsinghua University
2Jiangsu Collaborative Innovation Center for Language Ability, Jiangsu Normal University
3Search Product Center, WeChat Search Application Department, Tencent, China
{qfc17, linyk14}@mails.tsinghua.edu.cn, [email protected], [email protected], [email protected], [email protected]
* Indicates equal contribution. † Corresponding author.

Abstract

Sememes are defined as the minimum semantic units of human languages. As important knowledge sources, sememe-based linguistic knowledge bases have been widely used in many NLP tasks. However, most languages still do not have sememe-based linguistic knowledge bases. Thus we present a task of cross-lingual lexical sememe prediction, aiming to automatically predict sememes for words in other languages. We propose a novel framework to model correlations between sememes and multi-lingual words in a low-dimensional semantic space for sememe prediction. Experimental results on real-world datasets show that our proposed model achieves consistent and significant improvements compared to baseline methods in cross-lingual sememe prediction. The code and data of this paper are available at https://github.com/thunlp/CL-SP.

Figure 1: An example of HowNet. (The figure shows the word "apple" with its two senses: apple (fruit), annotated with the sememe fruit, and apple (brand), annotated with the sememes computer, PatternValue, able, bring and SpecificBrand.)

1 Introduction

Words are regarded as the smallest meaningful units of speech or writing that can stand by themselves in human languages, but they are not the smallest indivisible units of meaning. That is, the meaning of a word can be represented as a set of semantic components. For example, "Man = human + male + adult" and "Boy = human + male + child". In linguistics, the minimum semantic unit of meaning is called a sememe (Bloomfield, 1926). Some researchers believe that the semantic meanings of concepts such as words can be composed from a limited closed set of sememes, and that sememes can help us comprehend human languages better.

Unfortunately, the lexical sememes of words are not explicit in most human languages. Hence, people construct sememe-based linguistic knowledge bases (KBs) by manually annotating every word with a pre-defined closed set of sememes. HowNet (Dong and Dong, 2003) is one of the most well-known sememe-based linguistic KBs. Different from WordNet (Miller, 1995), which focuses on the relations between senses, it annotates each word with one or more relevant sememes. As illustrated in Fig. 1, the word apple has two senses in HowNet, apple (fruit) and apple (brand). The sense apple (fruit) has one sememe, fruit, and the sense apple (brand) has five sememes: computer, PatternValue, able, bring and SpecificBrand. There exist about 2,000 sememes and over 100 thousand labeled Chinese and English words in HowNet. HowNet has been widely used in various NLP applications such as word similarity computation (Liu and Li, 2002), word sense disambiguation (Zhang et al., 2005), question classification (Sun et al., 2007) and sentiment classification (Dang and Zhang, 2010).
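To make the word–sense–sememe structure of the apple example concrete, the snippet below represents it as nested Python data. The layout and the union operation are illustrative assumptions only, not HowNet's actual storage format or API.

```python
# Schematic of the HowNet annotation for "apple" described above.
# The nested-dict layout is an illustrative assumption, not HowNet's real format.
hownet_entry = {
    "apple": {
        "apple (fruit)": ["fruit"],
        "apple (brand)": ["computer", "PatternValue", "able", "bring", "SpecificBrand"],
    }
}

# A word's overall sememe set is the union over its senses.
all_sememes = {s for sememes in hownet_entry["apple"].values() for s in sememes}
print(sorted(all_sememes))
```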
However, most languages do not have such sememe-based linguistic KBs, which prevents us from understanding and utilizing human languages to a greater extent. Therefore, it is important to build sememe-based linguistic KBs for various languages. Manual construction of sememe-based linguistic KBs requires the efforts of many linguistic experts, which is time-consuming and labor-intensive. For example, the construction of HowNet has cost many Chinese linguistic experts more than 10 years.

To address the issue of the high labor cost of manual annotation, we propose a new task, cross-lingual lexical sememe prediction (CLSP), which aims to automatically predict lexical sememes for words in other languages. CLSP aims to assist linguistic experts in annotation. There are two critical challenges for CLSP: (1) There is no consistent one-to-one match between words in different languages. For example, the English word "beautiful" can refer to either of the Chinese words "美丽" and "漂亮". Hence, we cannot simply translate HowNet into another language, and how to recognize the semantic meaning of a word in another language becomes a critical problem. (2) Since there is a gap between the semantic meanings of words and sememes, we need to build semantic representations for words and sememes to capture the semantic relatedness between them.

To tackle these challenges, in this paper we propose a novel model for CLSP, which aims to transfer a sememe-based linguistic KB from a source language to a target language. Our model contains three modules: (1) monolingual word embedding learning, which learns semantic representations of words for the source and target languages respectively; (2) cross-lingual word embedding alignment, which bridges the gap between the semantic representations of words in the two languages; and (3) sememe-based word embedding learning, which incorporates sememe information into the word representations. For simplicity, we do not consider the hierarchy information in HowNet in this paper.

In experiments, we take Chinese as the source language and English as the target language to show the effectiveness of our model. Experimental results show that our proposed model can effectively predict lexical sememes for words with different frequencies in the target language. Our model also achieves consistent improvements on two auxiliary experiments, bilingual lexicon induction and monolingual word similarity computation, by jointly learning the representations of sememes and of words in the source and target languages.
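As a concrete illustration of how such jointly learned embeddings can be used, here is a minimal NumPy sketch of predicting sememes for a target-language word from its most similar source-language words in the unified space. The toy vectors, the cosine ranking and the neighbor-voting rule are assumptions for illustration only, not the paper's actual scoring function.

```python
import numpy as np

# Toy embeddings, assumed to already live in one unified semantic space.
src_emb = {
    "苹果": np.array([0.9, 0.1, 0.0]),   # "apple"
    "美丽": np.array([0.0, 0.8, 0.3]),   # "beautiful"
    "电脑": np.array([0.7, 0.0, 0.6]),   # "computer"
}
src_sememes = {
    "苹果": {"fruit", "computer", "SpecificBrand"},
    "美丽": {"beautiful"},
    "电脑": {"computer", "able"},
}
tgt_emb = {"apple": np.array([0.85, 0.1, 0.1])}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_sememes(target_word, k=2):
    """Rank sememes by summing similarity over the k nearest source words."""
    sims = {w: cosine(tgt_emb[target_word], v) for w, v in src_emb.items()}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    scores = {}
    for w in neighbors:
        for s in src_sememes[w]:
            scores[s] = scores.get(s, 0.0) + sims[w]
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(predict_sememes("apple"))
```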
2 Related Work

Since HowNet was published (Dong and Dong, 2003), it has attracted wide attention from researchers. Most related works focus on applying HowNet to specific NLP tasks (Liu and Li, 2002; Zhang et al., 2005; Sun et al., 2007; Dang and Zhang, 2010; Fu et al., 2013; Niu et al., 2017; Zeng et al., 2018; Gu et al., 2018). To the best of our knowledge, only Xie et al. (2017) and Jin et al. (2018) conduct studies of augmenting HowNet by recommending sememes for new words. However, both works recommend sememes for monolingual words only and are not applicable to the cross-lingual setting. Accordingly, our work is the first effort to automatically perform cross-lingual sememe prediction to enrich sememe-based linguistic KBs.

Our novel model adopts the method of word representation learning (WRL). Recent years have witnessed great advances in WRL. Models like Skip-gram, CBOW (Mikolov et al., 2013a) and GloVe (Pennington et al., 2014) are immensely popular and achieve remarkable performance in many NLP tasks. However, most WRL methods learn distributional information of words from large corpora while disregarding the valuable information contained in semantic lexicons. Therefore, some works try to inject semantic information from KBs into WRL (Faruqui et al., 2015; Liu et al., 2015; Mrkšić et al., 2016; Bollegala et al., 2016). Nevertheless, these works all target word-based KBs such as WordNet; few works pay attention to how to incorporate the knowledge from sememe-based linguistic KBs.

There have also been plenty of studies on cross-lingual WRL (Upadhyay et al., 2016; Ruder, 2017). Most of them require parallel corpora (Zou et al., 2013; AP et al., 2014; Hermann and Blunsom, 2014; Kočiský et al., 2014; Gouws et al., 2015; Luong et al., 2015; Coulmance et al., 2015). Some of them adopt unsupervised or weakly supervised methods (Mikolov et al., 2013b; Vulić and Moens, 2015; Conneau et al., 2017; Artetxe et al., 2017). There are also some works using a seed lexicon as the cross-lingual signal (Dinu et al., 2014; Faruqui and Dyer, 2014; Lazaridou et al., 2015; Shi et al., 2015; Lu et al., 2015; Gouws et al., 2015; Wick et al., 2016; Ammar et al., 2016; Duong et al., 2016; Vulić and Korhonen, 2016).

In terms of our cross-lingual sememe prediction task, parallel-data-based bilingual WRL methods are unsuitable because most language pairs have no large parallel corpora. Unsupervised methods are not appropriate either, as it is generally hard for them to learn high-quality bilingual word embeddings. Therefore, we choose the seed lexicon method in our model, and further introduce a matching mechanism inspired by Zhang et al. (2017) to enhance its performance.

3 Methodology

In this section, we introduce our novel model for CLSP. Here we define the language with sememe annotations as the source language and the language without sememe annotations as the target language. The main idea of our model is to learn word embeddings of the source and target languages jointly in a unified semantic space, and then to predict sememes for words in the target language according to the words with similar semantic meanings in the source language.

For monolingual word embedding learning, we use the Skip-gram model, which aims at maximizing the predictive probability of the context words conditioned on the centered word. Formally, taking the source side as an example, given a training word sequence $\{w_1^S, \dots, w_n^S\}$, the Skip-gram model intends to minimize:

$$\mathcal{L}^S_{\text{mono}} = -\sum_{c=K+1}^{n-K} \; \sum_{\substack{-K \le k \le K \\ k \neq 0}} \log P(w^S_{c+k} \mid w^S_c), \qquad (3)$$

where $K$ is the size of the sliding window. $P(w^S_{c+k} \mid w^S_c)$ stands for the predictive probability of one of the context words conditioned on the centered word $w^S_c$, formalized by the following softmax function:

$$P(w^S_{c+k} \mid w^S_c) = \frac{\exp(\mathbf{w}^S_{c+k} \cdot \mathbf{w}^S_c)}{\sum_{w^S_s \in V^S} \exp(\mathbf{w}^S_s \cdot \mathbf{w}^S_c)}, \qquad (4)$$

where $V^S$ denotes the word vocabulary of the source language.
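The following is a minimal NumPy sketch of the objective in Eqs. (3) and (4), assuming a single embedding matrix shared by centered and context words and a toy word-id sequence; it is meant only to make the loss concrete, not to reproduce the paper's training setup.

```python
import numpy as np

def skipgram_loss(word_ids, emb, K):
    """Negative log-likelihood of Eq. (3), with the full softmax of Eq. (4).

    word_ids: list of integer word ids (the training sequence w_1 ... w_n)
    emb:      (vocab_size, dim) embedding matrix, one vector per word
    K:        size of the sliding window
    """
    n = len(word_ids)
    loss = 0.0
    for c in range(K, n - K):                     # centered words with a full window
        w_c = emb[word_ids[c]]
        scores = emb @ w_c                        # dot product with every vocabulary word
        log_Z = np.log(np.exp(scores).sum())      # log of the softmax normalizer
        for k in range(-K, K + 1):
            if k == 0:
                continue
            ctx = word_ids[c + k]
            # accumulate -log P(w_{c+k} | w_c) under the softmax of Eq. (4)
            loss -= scores[ctx] - log_Z
    return loss

# Toy usage: 6-word vocabulary, random embeddings, window size 2.
rng = np.random.default_rng(0)
emb = rng.normal(scale=0.1, size=(6, 8))
sequence = [0, 3, 1, 4, 2, 5, 1, 0, 3]
print(skipgram_loss(sequence, emb, K=2))
```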