Indian Journal of Science and Technology, Vol 9(28), DOI: 10.17485/ijst/2016/v9i28/95135, July 2016
ISSN (Print): 0974-6846
ISSN (Online): 0974-5645

Fuzzy Concept Lattice for Ontology Learning and Concept Classification

K. Selvi1* and R. M. Suresh2
1Sathyabama University, Sholinganallur, Chennai - 600119, Tamil Nadu, India
2Sri Lakshiammal Engineering College, Chennai - 600126, Tamil Nadu, India

Abstract
Objective: To discover the new latent concept, which is significant to self-learning and [...], and to better understand the conceptual relations of the query terms. Methods: Fuzzy Formal Concept Analysis is applied to construct a Concept Lattice that better describes the semantic relations of incoming patterns for a collection of documents. Findings: There has been little work that evaluates the effect of various techniques and parameter settings in word space construction from corpora. The present paper experimentally investigates how the choice of a particular domain helps the user to discover information that is much closer to his preferences. Applications/Improvements: The results generated using our novel approach have been tested in the context of finding Sea side Schools based on user requirements. A comparison of related papers is performed to address the challenging issues of ontology construction for text classification in the Semantic Web.
Keywords: Fuzzy Formal Concept Lattice, Machine Learning, Ontology, Semantic Web, Web Text Analysis

1. Introduction

Ontology is defined as a lexical representation of the semantic relations that occur between word pairs, which can be represented as a [...] for any context. For expressing and processing knowledge, ontology plays a vital role in web text analysis tasks and is widely used in all machine learning [...]. To process a language is to understand and accurately apply its grammar. The lexis of an entity object for a particular field consists of words that represent the properties and relations of valid constants. Axioms let us place bounds on the values that predicates can take [1]. They are used to identify the semantic relations between the query terms, which implicitly follow the interpretation of these primary terms. At the other end, the axioms find the set of possible interpretations of the query words that match the formal scheme to the lexemes of the language. That is why ontology is chosen to retrieve only related information.

1.1 Principles of Ontology Construction

Ontology construction can be categorized into two approaches. One is the bottom-up approach, which is widely used in [...], and the other is the top-down approach, which is most suitable for a domain with objects and attributes [4]. In the top-down approach, the relation between the objects and attributes is fragmented on the basis of the object language for a particular domain, which is obtained by following the steps given below:
• List the domain according to the context.
• Find the prime lexis.
• Apply the grammar rules.
• Include the secondary terms by intended definitions.
• Extend the depth of search by finding the tertiary terms.

Initially, the lexemes required to form the language descriptions for a given context are listed with a well-defined lexicon. As a next step, the axioms and predicates

*Author for correspondence Fuzzy Concept Lattice for Ontology Learning and Concept Classification

that match the selected lexemes are identified. The structural properties of the given terms and their relations are identified by applying the grammar rules. Using the vocabulary of the primary terms, the secondary and tertiary terms are identified using the predicates of the individual terms in the domain. The formulation of axioms uses variables that each represent an individual. For example, if a is the daughter of b and b is the brother of c, then c is the uncle of a, which is an axiom relating the predicates Daughter Of, Brother Of and Uncle Of. Here a, b and c are understood to denote unnamed persons. The semantic relations among the terms are clearly recognized from the structural properties specified in the above steps for all iterations.

1.2 Ontology Construction Approaches

Linguistic analysis: This method compiles a lexical dictionary in which the hierarchy of concepts is designed without human intervention. Lexical dictionaries hold words along with their synonyms, root words, word origins, etc., which form the basis for other construction methods. Combinations of several classifiers did not improve the classification accuracy beyond that of the individual classifiers [5]. We tackle the problem of information processing by an elegant way of representing [...] in data. The objective of this ontology construction is to better understand the semantic relations of the incoming patterns in machine learning and to discover [...] and relations between the objects and their attributes.

The work in [6] finds the cluster index using fuzzy c-means clustering, where [...] is the maximum number of cluster partitions for n clusters, which is not satisfactory for all contexts. The approach in [7] identifies cancer cells using a fuzzy enhanced mammogram approach that took less time than the existing methods. The method in [8] classifies documents based on concept and semantic relations by observing the activities and comparing them with the features extracted for the context in which they occur.

The lexical dictionary based method is restricted to the category size of the word list and can therefore form domains having dissimilar scopes. This lexical dictionary based ontology has a formal portrayal that does not pertain to a particular domain. After integration with other methods, an ontological framework is developed for ontology learning. There are four different techniques for document classification, namely the naive Bayes classifier, the nearest neighbour classifier, decision trees and a subspace method. These were applied to Yahoo news groups such as business, entertainment, health, international, politics, sports and technology, individually and in combination [9].

An Iceberg Concept Lattice was proposed in [10]: a huge concept lattice is clustered into small ones by a concept clustering algorithm [10–13], which is conducive to scaling and displaying the concept lattice. The works in [14–17] identify articles for a specific domain using the linked classification information available in Wiki pages, but only a concept hierarchy can be built for the available pages. In [18,19] the authors demonstrate how ontology helps in identifying clinical predictors for coronary heart disease. The depth of the conceptual expression is easily understood, which helps in implementing the operational aspect of the Semantic Web. Classification is often used as a basis for many machine-learning applications. As a part of data security, knowledge discovery and classification work together during various query-processing steps [21]. Our proposed method enables attribute classification that helps us to determine the facilities based on our requirements. Data classification requires both manual processes and automatic tools to achieve a high level of efficiency [22].

The major components of fuzzy logic are fuzzy sets, linguistic variables, possibility distributions, and fuzzy if-then rules. Fuzziness, or degree of uncertainty, pertains to the uncertainty associated with a system, i.e., the fact that nothing can be predicted with exactitude. [...] theory provides a coherent basis for machine learning and also forms an elegant, statistically well founded representation of the uncertainty in the data. Moreover, the data to be operated on are frequently imprecise. Application of fuzzy logic or its derivatives has become a common approach in recent years in all text mining and machine learning techniques [23–26].

This paper is structured as follows. The next section discusses the need for Ontology Learning and briefly explains how the Concept Lattice is constructed. In Section 3, a survey of domain ontologies and the role of Fuzzy Concept is given, with various applications in the field of Natural Language Processing. Subsequently, in Section 4, an overview of the proposed method for constructing concept lattices is presented, followed by a comparative study of related methods used by various authors.

2 Vol 9 (28) | July 2016 | www.indjst.org Indian Journal of Science and Technology K. Selvi1 and R. M. Suresh

Finally, Section 6 discusses how this work can be extended to generalize this technique to all domains.

2. Materials and Methods

Ontology provides a common understanding of specific domains that can be shared among people and application systems. Many deficiencies still exist in ontology. It is hard to find the granularity of an ontology and the depth of concept expression. Formal concept analysis can help resolve the problem that ontology is poor in describing the depth of information [27]. Concept lattice theory automatically builds the concept lattice and generates a more formalized model, which increases the flexibility of ontology in a heterogeneous system and hence the operational aspect of the Semantic Web. Combining formal concept analysis with ontology should therefore be a better way of expressing and processing knowledge. Therefore, we propose a novel method of fuzzy ontology merging based on fuzzy concept gluing. The results of our method show that the accuracy of merging is largely improved. Moreover, it can discover the inherent concepts and relationships.

Fuzzy Formal Concept Analysis can support ontology construction when some information is more significant than other data, or Semantic Web search when the user is not satisfied with what he retrieves. In this paper, we explain the construction of a Concept Lattice by grouping the attributes based on the fuzzy values available for a set of six objects pertaining to a particular context. This enables the user to discover information that is closer to his preferences. Fuzzy sets are derived by generalizing the characteristic function to a membership function, as given in Equation (1), whose value ranges from 0 to 1.

$$\mu(x) = \frac{x^{2}}{x^{2} + 1} \tag{1}$$

After the processing described above is performed, the fuzzy concept lattice to be merged and the source fuzzy concept lattice are glued into a big one. Since the concept lattice is complete, it can be clustered into concept hierarchies automatically. Therefore, the concept lattice is classified into different concepts resulting from nuances. In this scenario, a domain expert should be brought in to delete undesirable concept nodes. Then the fuzzy concept lattice is converted to a fuzzy ontology. Figure 1 shows the conversion rules.

#Rule 1: Concept Node → Class
#Rule 2: Extension collection of Concept → Instance of Class
#Rule 3: Intension collection of Concept → Attribute of Class
Figure 1. Conversion rules.

Fuzzy Formal Concept Analysis is a generalization of Formal Concept Analysis for representing uncertainty in data. It starts from a table K = (G, M, I) called a formal context, where G is a set of objects, M is a set of attributes and I is a relation on G × M. Fuzzy Formal Concept Analysis can support ontology construction when a small amount of information is more significant than the other data and the user is not sure what he or she is looking for. For instance, consider a context named Sea side Schools. Suppose that the set O is defined by six objects representing six different Sea side Schools, O = {O1, O2, O3, O4, O5, O6}, and the set A is defined by four possible attributes of these objects, A = {Playground, Computerized Class rooms, Meal, Sea}. Furthermore, suppose the Sea side Schools are related to the above attributes according to the binary relation defined by Table 1.

Table 1. The sea side schools context in (non-fuzzy) FCA

Sea side Schools/Objects   Playground   Computerized Class rooms   Meal   Sea
O1                         0            0                          1      1
O2                         1            1                          1      0
O3                         0            0                          1      1
O4                         1            1                          0      1
O5                         0            0                          1      1
O6                         1            1                          0      1

For instance, the school O4 has, or is described by, three attributes, namely Playground, Computerized Class rooms, and Sea, and vice versa, these three attributes apply to the object O4. A concept of the Sea side Schools context is, for instance, ({O4, O6}, {Playground, Computerized Class rooms, Sea}), since both O4 and O6 have the attributes Playground, Computerized Class rooms, and Sea, and vice versa, all these attributes relate to both the objects O4 and O6. Given any two concepts of a context, (E1, I1), (E2, I2), it is possible to create an inheritance relation among


them according to the following condition: (E1, I1) ≤ (E2, I2) if and only if E1 ⊆ E2 (equivalently, I2 ⊆ I1). In particular, (E1, I1) is called a sub-concept of (E2, I2), and (E2, I2) is called a super-concept of (E1, I1).

Figure 2. Formal concept lattice.

The Formal Concept Lattice (Figure 2) incorporates fuzzy logic into Formal Concept Analysis to represent vague information. Given a context X, a fuzzy set A in X is described by a membership function μA(x) which links each point in X with a real number in the interval [0, 1]:

A = {(x, μA(x)) | x ∈ X}

The value μA(x) represents the rank of membership of x in A. Note that for a regular set, the membership function can take only the values 1 and 0. Consider the schools available near the sea side, with the fuzzy context specified by the fuzzy relation given in Table 2. In particular, the binary values in Table 1 have been replaced by ranks of membership, from 0 to 1, each allowing us to quantify how much an object has, or is described by, an attribute and, vice versa, how much an attribute applies to an object.

Table 2. The sea side schools context in (fuzzy) FFCA

Sea side Schools/Objects   Playground   Computerized Class rooms   Meal   Sea
O1                         0.0          0.0                        1.0    1.0
O2                         0.6          1.0                        0.5    0.0
O3                         0.0          0.0                        0.5    0.7
O4                         0.8          1.0                        0.0    1.0
O5                         0.0          0.0                        1.0    0.3
O6                         0.8          1.0                        0.0    0.8

Consider the school O2 in Table 2. It has the attribute Computerized Class rooms with rank of membership 1.0, which means that this attribute fully applies to the school O2 (and vice versa, the school O2 can be properly described by the attribute Computerized Class rooms). Instead, the object O2 has the attribute Meal with a membership value of 0.5, which means that this attribute only partially applies to this school (for instance, it could provide meals only for dinner and not always). Analogously, in the case of O3, the value 0.7 for the attribute Sea means that this feature better describes the Sea side Schools O1, O4 or O6 than O3, but it is more appropriate to O3 than to O5 (O5 having a lower rank of membership with Sea, i.e., 0.3). A threshold is fixed so that object-attribute pairs with membership values less than the threshold are ignored.

The fuzzy concept lattice partition algorithm is described as follows.
Step 1: Each formal concept in the fuzzy concept lattice is described as a concept vector.
Step 2: The concept lattice is traversed from top to bottom. In this process, each node of the first layer below the top layer is labeled as an individual set L1, L2, . . . , Ln.
Step 3: The similarity between every layer's node Ci and its child-node is calculated by Equation (2).
Step 4: A child-node is added into a set when σL is larger than the threshold T. If a child-node is added into multiple sets, they are merged into one.
Step 5: If no node's similarity is larger than the threshold T, the process of searching and labeling is suspended.
Step 6: If there are still nodes to be processed, Step 3 is executed. Otherwise Step 7 is executed.
Step 7: The labeled sets are returned, and each of these sets is added into the top layer and uppermost layer.

$$\sigma_L(C_A, C_B) = \frac{(C_A - C_B) + (1 - C_A \cdot C_B)}{2} \tag{2}$$

3. Results and Discussion

This section compares the results given by various authors who have worked towards our objective; their techniques include Semantic Web Search, Web Text Analysis, Fuzzy Formal Concept Analysis and Rough Set theory. They deal with the performance measures of search engines and with ontology merging and/or ontology building through the application of Formal Concept Analysis. Table 4 gives a comparative study of the above techniques in Ontology Construction, Clustering and Similarity Measures.
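To make the thresholding of the fuzzy context in Table 2 concrete, the following Python sketch drops every object-attribute pair whose membership is below a threshold and then derives one crisp concept from what remains. It is an illustration under stated assumptions, not the authors' implementation: the threshold value 0.6 and the helper names `apply_threshold`, `extent` and `intent` are hypothetical; only the membership values are taken from Table 2.

```python
# Membership values copied from Table 2 (objects x attributes).
ATTRIBUTES = ["Playground", "Computerized Class rooms", "Meal", "Sea"]
FUZZY_CONTEXT = {
    "O1": [0.0, 0.0, 1.0, 1.0],
    "O2": [0.6, 1.0, 0.5, 0.0],
    "O3": [0.0, 0.0, 0.5, 0.7],
    "O4": [0.8, 1.0, 0.0, 1.0],
    "O5": [0.0, 0.0, 1.0, 0.3],
    "O6": [0.8, 1.0, 0.0, 0.8],
}


def apply_threshold(context, threshold):
    """Keep only the attributes whose membership is >= threshold."""
    return {
        obj: {ATTRIBUTES[i] for i, mu in enumerate(row) if mu >= threshold}
        for obj, row in context.items()
    }


def extent(crisp, attrs):
    """Objects that have every attribute in attrs."""
    return {obj for obj, has in crisp.items() if attrs <= has}


def intent(crisp, objs):
    """Attributes shared by every object in objs."""
    shared = [crisp[o] for o in objs]
    return set.intersection(*shared) if shared else set(ATTRIBUTES)


# Hypothetical threshold; the paper does not state the value it used.
crisp = apply_threshold(FUZZY_CONTEXT, threshold=0.6)

# Close the attribute set {Playground, Sea} into a concept (extent, intent).
concept_extent = extent(crisp, {"Playground", "Sea"})
concept_intent = intent(crisp, concept_extent)
print(sorted(concept_extent), sorted(concept_intent))
```

With this assumed threshold the sketch recovers the concept ({O4, O6}, {Playground, Computerized Class rooms, Sea}) used as the example for Table 1, while the partial membership of O2 in Meal (0.5) is ignored. A lower threshold would keep more pairs and generally yield a larger lattice.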


Table 3. Classified instances using zeroR classification

Relation: schools-weka.filters.supervised.attribute.AddClassification-Wweka.classifiers.rules.ZeroR
Instances: 6
Attributes: 5
    Sea side Schools/Objects
    Playground
    Computerized Class rooms
    Meal
    sea
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===
Number of training instances: 6
Number of Rules: 3
Non matches covered by Majority class.
Total number of subsets evaluated: 11
Merit of best subset found: 0.368
Evaluation (for feature selection): CV (leave one out)
Feature set: 1,5
Time taken to build model: 0.8 seconds
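For readers unfamiliar with the classifier named in the caption of Table 3, ZeroR is a baseline that ignores all attributes and always predicts the most frequent class. The Python sketch below illustrates only that idea; it does not reproduce the Weka run. The class labels are hypothetical, since the paper does not list the class assigned to each of the six schools, and `zero_r_fit` is an illustrative name.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the majority class; ZeroR ignores every attribute."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical class labels for O1..O6 (not given in the paper).
labels = ["yes", "no", "yes", "yes", "no", "yes"]

majority = zero_r_fit(labels)
predictions = [majority] * len(labels)  # same prediction for every instance
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(majority, round(accuracy, 3))
```

Because it uses no attribute information, a baseline of this kind mainly serves as a reference point against which attribute-based classifiers are compared.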

Table 4. Related techniques comparative study

FFCA RST SWS FCA FT Goal
Chen and Lin Keywords extr.
Doherty et al. X Ont. Building
Formica x Similarity
Hwang et al. x Ont. Building
Jiang et al. X DL reasoning
Miao et al. X Clustering
Ngo and Nguyen X Clustering
Stumme and Maedche x Ont. Merging
Tho et al. X Ont. Building
Zhang et al. x Ont. Building
Zhao and Halang X x Similarity
Zhao et al. X x Similarity
Proposed Method X x Domain Ontology Construction

For the above reason, we follow the steps below during our experimentation:
Step 1: Select a set of documents for a particular context (Sea side Schools).
Step 2: Generate a matrix with non-fuzzy values to construct the Formal Concept Lattice by using the conversion rules given in Figure 1.
Step 3: Identify a set of predefined attributes that match the intents of the concepts that occur in the lattice formed in Step 2.
Step 4: Evaluate the results and compare them with the proposals submitted for those predefined query sets.
Step 5: Submit the results with the ranks obtained to the selected evaluators for human judgments.
Step 6: Correlate the human judgment values with the results obtained to build the concept lattice with the fuzzy values.

In [3], Formal Concept Analysis is used as a knowledge acquisition tool to give more intensive search results. In particular, the user query is matched with the intents of the Concept Lattice without using approximation operators and without the possibility of choosing the objects that better satisfy the user needs according to fuzzy values. In our proposal, FFCA allows the user to choose the preferred answers on the basis of ranks of membership that specify how well objects are described by the searched attributes. For this reason, we decided to restrict the comparison to the four proposals addressing SWS and Formal Technique. Table 3 shows the result generated on classification of the 6 instances O1 to O6 with 10-fold cross-validation in 0.8 seconds.

4. Conclusion

Constructing a concept lattice by identifying the semantic relations using ontology can accomplish the task of learning for concept classification. This paper experimentally investigated how the choice of a particular domain


helps the user to discover information that is much closer to his preferences. Evaluation of the proposed method produced better results, which were well suited to detecting the semantic relations between objects. It is clearly shown that formal concept analysis and concept lattice theory not only mechanize ontology building but also make the created ontology more formalized, thus increasing the tractability of ontology in a heterogeneous system and the operational aspects of the Semantic Web. This approach could be further enhanced by increasing the space of the data, taking into account various types of attributes for classification and clustering.

5. Acknowledgement

I owe my sincere thanks to my co-author and guide Dr. R.M. Suresh for his valuable mentoring and guidance in preparing this paper, for helping me with valid suggestions and for directing me in the correct path. I thank the editorial board for having given me this opportunity to publish my paper in this journal. I thank my parents and my family members, who were a constant support during the period of my research.

6. References

1. Gruber TR. Towards principles for the design of ontologies used for knowledge sharing. International Journal of Human-Computer Studies. 1995; 43(1):907–28.
2. Selvi K, Suresh RM. Measure semantic similarity between words using fuzzy formal concept analysis. Proceedings of IRNet; India. 2012. p. 31–4.
3. Selvi K, Suresh RM. An efficient technique to implement similarity measures in text document clustering using artificial neural networks algorithm. Research Journal of Applied Sciences Engineering and Technology. 2014; 8(23):2320–28.
4. Noy N, McGuinness DL. Ontology development 101: A guide to creating your first ontology. Technical Report; Stanford. 2001.
5. Li YH, Jain AK. Classification of text documents. The Computer Journal. 1998; 41(8):537–46.
6. Revathy S, Parvaathavarthini B, Rajathi S. Futuristic validation method for rough [...]. Indian Journal of Science and Technology. 2015 Jan; 8(2):120–7.
7. Rao TVN, Govardhan A. Efficient segmentation and classification of mammogram images with fuzzy filtering. Indian Journal of Science and Technology. 2015 Jul; 8(15):1–8.
8. Ramana AV, Reddy EK. OCCSR: Document classification by order of context, concept and semantic relations. Indian Journal of Science and Technology. 2015 Nov; 8(30):1–8.
9. Moreno A, Sanchez D. Learning medical ontologies from the Web. Knowledge management for Health care procedures. Berlin Heidelberg: Springer-Verlag; 2008. p. 32–45.
10. Stumme G, Taouil R, Bastide Y, et al. Conceptual clustering with iceberg concept lattices. Data and Knowledge Engineering. 2002; 42(2):189–222.
11. Miller GA. WordNet: A lexical database for English. Communications of the ACM. 1995; 38(11):39–41.
12. Resnik P. Using information content to evaluate semantic similarity in a taxonomy. Proceedings of IJCAI'95; 1995. p. 448–53.
13. Bollegala D, Matsuo Y, Ishizuka M. Measuring semantic similarity between implicit semantic relations. ACM Proceedings of International World Wide Web Conference Committee; 2009. 978-1-60558-487.
14. Chan CW. From knowledge modeling to ontology construction. International Journal of Software Engineering and Knowledge Engineering (IJSEKE). 2004; 14(6).
15. Cui G, Lu Q, Li W, Chen Y. Corpus Exploitation from Wikipedia for Ontology Construction. p. 2125–32.
16. Chou C-H, Zahedi F, Zhoa H. Ontology for developing websites for natural disaster management: Methodology and implementation; 2008.
17. Selvi K, Suresh RM. Document clustering using artificial neural networks. National Journal on Advances in Computing and Management. 2008; 5(1):1–5.
18. Rinaldi AM. An ontology-driven approach for semantic information retrieval on the web. ACM Transactions on Internet Technologies. 2009; 9(10).
19. Binfeng X, Xiaogang L, Cenglin P, Qian H. Based on ontology: Construction and application of medical knowledge base. IEEE International Conference on Complex Medical Engineering; 2007. p. 586–9.
20. Swartout B, Patil R, Knight K, Russ T. Toward distributed use of large-scale ontologies. AAAI Technical Report; 2007. SS-97-06.
21. Yang BR, Zheng DQ, Yang J, Liu L, Jia MY, Sun WC. Automatic ontology construction approaches and its applications on military intelligence. Proceedings of Asia Pacific Conference on Information Processing; 2009. p. 348–51.
22. Khan L, Luo F. Ontology construction for information selection. Proceedings of 14th IEEE International Conference on Tools with Artificial Intelligence; Washington. 2002. p. 122–7.
23. Kietz JU, Maedche A, Volz R. A method for semi-automatic ontology acquisition from a corporate intranet. Proceedings of the EKAW'2000 Workshop on Ontologies and Texts; France. 2000.


24. Sampath S, Murugan S. Similarity measure using fuzzy formal concept for any context. International Journal of Advanced [...]. 2013; 12(3):607–13.
25. Bendaoud R, Mohamed RH, Yannick G. Text-based ontology construction using relational concept analysis. Workshop on Ontology Dynamics-IWOD 2007; France. 2007.
26. Cimiano P, et al. Learning concept hierarchies from text corpora using formal concept analysis. Journal of Artificial Intelligence Research. 2005; 24(1):205–339.
27. Shey H, Kindervag J. Rethinking data discovery and data classification. Strategic plan. The Data Security and Privacy Playbook. USA: Forrester Research Inc; 2014.
