Reverse Dictionary
Imperial Journal of Interdisciplinary Research (IJIR), Vol-2, Issue-4, 2016, ISSN: 2454-1362, http://www.onlinejournal.in

Veena Gurram1, Sweta Rathod2, Anish Lushte3, Pranay Vaidya4, Prof. Vinod Alone5 & Prof. Mahendra Pawar6
1,2,3,4 B.E. Student, 5,6 Assistant Professor
1,2,3,4,5,6 Department of Computer Engineering, PVPPCOE, Sion, Mumbai 400022

Abstract: In this paper, we describe the design and implementation of a reverse dictionary. Unlike a traditional forward dictionary, which maps from words to their definitions, a reverse dictionary takes a user input phrase describing the desired concept and returns a set of candidate words that satisfy the input phrase. This work has significant application not only for the general public, particularly those who work closely with words, but also in the general field of conceptual search.

Index Terms: Dictionaries, thesauruses, search process, web-based services.

I. INTRODUCTION

In this paper, we report work on creating an online reverse dictionary (RD). As opposed to a regular (forward) dictionary that maps words to their definitions, an RD performs the converse mapping: given a phrase describing the desired concept, it provides words whose definitions match the entered phrase. For example, suppose a forward dictionary informs the user that the meaning of the word “spelunking” is “exploring caves.” A reverse dictionary, on the other hand, offers the user an opportunity to enter the phrase “check out natural caves” as input and expect to receive the word “spelunking” (and possibly other words with similar meanings) as output.

Effectively, the RD addresses the “word is on the tip of my tongue, but I can’t quite remember it” problem. People afflicted heavily by this problem include writers of all kinds: students, professional writers, scientists, marketing and advertisement professionals, teachers, and so on. In fact, for most people with a certain level of education, the problem is often not lacking knowledge of the meaning of a word but, rather, being unable to recall the appropriate word on demand. The RD addresses this widespread problem.

II. RELATED WORK

In a reverse dictionary, the user gives an input phrase and receives a certain number of words as output, ranked by an algorithm. The works related to this reverse dictionary are the existing reverse dictionaries [3][4], which have certain drawbacks. An existing dictionary receives an input phrase and returns many output words, so it can be tedious for the user to pick one from them. A ranking-based dictionary, by contrast, lets the user choose from a set of words that are closely related to each other. Example: in an existing reverse dictionary, the input phrase “unknown name” returns over 100 search results, including “anonymous”, “nameless”, “incognito”, “jane doe”, “john doe”, “some”, “sky-blue pink”, “dark horse” etc. [4]. The most accurate results for “unknown name” are “anonymous” and “nameless”, although “incognito” also contributes some primary meaning to the user input phrase. Words like “sky-blue pink”, “challenge” and “key”, on the other hand, have nothing to do with the concept; they are associated only with the word “unknown”. The reverse mapping set (RMS) of a term t, denoted R(t), is built from the definitions in the dictionary: every word whose definition contains t forms part of the suggested output. Finally, the output words are arranged according to their rank.

III. LITERATURE SURVEY

The literature survey is presented with reference to the performance and approach of the current systems.

Existing System: After studying the existing systems, we concluded that they have the following shortcomings:
• The existing dictionary outputs 100 results, and most of them are not related to the phrase.
• In the existing systems, the ranking of results is not efficient ([16] T. Dao and T. Simpson; [6] Z. Wu and M. Palmer). Therefore, an accurate result for a given phrase is not guaranteed.
• Auto-correction of the input phrase is not done in the existing systems. For example, if the user makes a spelling mistake while entering the input phrase, the system ends up giving wrong results.

IV. PROPOSED SYSTEM

In a reverse dictionary, the user enters the desired phrase, and the system applies its logic to return a certain number of words as the output. To do so, we first have to build the RMS (Reverse Mapping Set). Building the RMS of a word ‘w’ means finding the set of words in whose definitions ‘w’ is found. Example: the word “sleep” is found in 4 definitions belonging to 4 words. Therefore R(sleep) will be “slumber”, “sopor”, “nap”, “rest”. These words must be entered manually for each word. The RMS of the words can be obtained from the WordNet [2][6] dictionary. Stop words such as “whereas”, “whenever”, “however” and “very” need to be removed, as they do not form an important part of the process. Antonyms, on the other hand, need to be addressed. Example: when the word “sleep” is preceded by “cannot”, the antonym of “sleep”, which is “wake up”, should be considered for the search process.

The search can be enhanced by considering the synonyms, hypernyms and hyponyms of a particular word. When the words do not yield enough output, the synonyms, hypernyms and hyponyms of that word are considered, and the RMS of those words also forms part of the output words. A synonym is another word with the same meaning, whereas a hypernym is the common class under which the word falls. Example: the word “parrot” belongs to “birds”; therefore the hypernym of “parrot” is “bird”. A hyponym is a more specific kind of the word, such as “macaw”. Considering synonyms, hypernyms and hyponyms will increase the number of output words, i.e., if “parrot” does not yield enough results, the synonyms, hyponyms and hypernyms of “parrot” will yield more results.

The final step is to sort the results. This is done when there is a large number of output words. Example: in an existing reverse dictionary, the input phrase “discrimination based on colour” returns over 100 search results, including “racism”, “classism”, “judgmental”, “colour bar”, “colour line”, “red”, “nepotism”, “fair” etc. The most accurate result for “discrimination based on colour” is “racism”, although “colour line” and “colour bar” also contribute some primary meaning to the user input phrase. Words like “fair” and “red”, on the other hand, have nothing to do with the concept; they are associated only with the word “colour” [3]. To avoid such unnecessary results, we decrease the number of words returned by the search, so that only the words closely associated with the search concept remain. Therefore, according to the previous example, when a user enters the phrase “discrimination based on colour”, the output shall be “racism”, “prejudice”, “colour line”, “colour bar”, “nepotism” etc. To avoid too many words, the input words are grouped together, and the words whose definitions contain the whole group (for instance, a pair of input words) are looked for [1]. Example: for “discrimination based on colour”, the definition of “racism” contains both “discrimination” and “colour”; therefore it must be given higher priority.

Advantages:
1. Time efficiency: Quick output.
2. Accuracy: Gives accurate words.
3. Auto Correction: The phrase entered by the user is corrected if the spelling is wrong.

System Architecture:

Fig: System Architecture

The reverse dictionary is a computer application which takes the user input phrase and gives the corresponding words as output. The RMS contains a set of mappings, the dictionary definitions, and parse trees [8] for the definitions. The synonym database consists of the set of synonyms for each individual word in the user input phrase. The hypernym/hyponym database consists of the hyponym and hypernym sets for each individual word in the user input phrase, whereas the antonym database consists of the set of antonyms for each word in the user input phrase.

Process:
• User enters a set of words/phrases to be looked up.
• Stop words, or unwanted words, which do not affect the meaning of a phrase or have minimal meaning, are removed from the input. E.g.: “this”.
• Remaining important words are stemmed, or converted to their base form. E.g.: “describing” becomes “describe”.
• Words/phrases containing negation are converted to their antonyms. E.g.: “not pleasant” becomes “unpleasant”.
• Tokens are generated, and the query is formed using these tokens.
• If too few results are found, the query is expanded on the basis of synonyms, hypernyms and hyponyms.
• Results are sorted on the basis of the WordNet hierarchy by comparing term similarity and term importance in the phrase.
• The result is displayed.

V. CONCLUSION

In this paper, we discuss the way to build a reverse dictionary and also note the existing problems that occur in the process. We therefore suggest a group of methods for constructing and querying a reverse dictionary, and show the quality of the results as well as the run time and scalability.

REFERENCES

... Domain Describing Taxonomies,” Proc. ACM Conf. Information and Knowledge Management, 2006.
[9] D. Lin, “An Information-Theoretic Definition of Similarity,” Proc. Int’l Conf. Machine Learning, pp. 296-298, 1998.
[10] Z. Wu and M. Palmer, “Verbs Semantics and Lexical Selection,” Proc. 32nd Ann. Meeting Assoc. for Computational Linguistics, pp. 133-138, 1994.
[11] D. Widdows and K. Ferraro, “Semantic Vectors,” http://code.google.com/p/semanticvectors/, 2010.
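
The Process steps of Section IV (stop-word removal, conversion to base forms, negation handling, RMS lookup and ranking) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions and not the implementation described in the paper: the toy dictionary FORWARD, the tables STOP_WORDS, NEGATIONS, ANTONYMS and BASE_FORMS, and the function names are all hypothetical, the definitions are invented for the example rather than taken from WordNet, and ranking simply counts how many query terms occur in a word's definition instead of using WordNet-hierarchy similarity. Query expansion and auto-correction are left out.

```python
import re
from collections import defaultdict

# Hypothetical toy forward dictionary (word -> definition). A real system
# would load definitions from a lexical database such as WordNet.
FORWARD = {
    "racism": "discrimination or prejudice based on race or colour",
    "nepotism": "discrimination in favour of relatives",
    "red": "the colour of blood",
    "slumber": "to sleep lightly",
    "sopor": "an abnormally deep sleep",
    "nap": "a short sleep during the day",
    "rest": "to relax or sleep",
    "nasty": "highly unpleasant or disgusting",
}

# Hypothetical lookup tables standing in for the stop-word list, the
# stemmer and the antonym database.
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "or", "to", "this", "very"}
NEGATIONS = {"not", "cannot", "no"}
ANTONYMS = {"pleasant": "unpleasant", "sleep": "wake"}
BASE_FORMS = {"describing": "describe", "colours": "colour"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_rms(forward):
    """Build the reverse mapping set: term -> words whose definition contains it."""
    rms = defaultdict(set)
    for word, definition in forward.items():
        for term in tokenize(definition):
            if term in STOP_WORDS or term in NEGATIONS:
                continue
            rms[BASE_FORMS.get(term, term)].add(word)
    return rms


def preprocess(phrase):
    """Drop stop words, map tokens to base forms, and turn negated terms into antonyms."""
    terms = []
    negate = False
    for token in tokenize(phrase):
        if token in NEGATIONS:
            negate = True          # the next content word is negated
            continue
        if token in STOP_WORDS:
            continue
        token = BASE_FORMS.get(token, token)
        if negate:
            token = ANTONYMS.get(token, token)
            negate = False
        terms.append(token)
    return terms


def lookup(phrase, rms):
    """Return candidate words, best first, ranked by number of matching query terms."""
    scores = defaultdict(int)
    for term in preprocess(phrase):
        for word in rms.get(term, ()):
            scores[word] += 1
    # Higher score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))


rms = build_rms(FORWARD)
print(sorted(rms["sleep"]))   # ['nap', 'rest', 'slumber', 'sopor']
print(lookup("discrimination based on colour", rms))   # ['racism', 'nepotism', 'red']
```

In this sketch “racism” is ranked first because its definition contains several of the query terms, whereas “nepotism” and “red” each match only one, which mirrors the priority rule described for the “discrimination based on colour” example.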