Using Distributional Semantics for Automatic Taxonomy Induction


Bushra Zafar (College of Electrical & Mechanical Engineering, National University of Science and Technology, Islamabad, Pakistan, [email protected]); Michael Cochez (Fraunhofer FIT, Sankt Augustin, Germany, and Informatik 5, RWTH Aachen University, DE-52056 Aachen, Germany, michael.cochez@fit.fraunhofer.de); Usman Qamar (College of Electrical & Mechanical Engineering, National University of Science and Technology, Islamabad, Pakistan, [email protected])

Abstract—Semantic taxonomies are powerful tools that provide structured knowledge to Natural Language Processing (NLP), Information Retrieval (IR), and general Artificial Intelligence (AI) systems. These taxonomies are extensively used for solving knowledge-rich problems such as textual entailment and question answering. In this paper, we present a taxonomy induction system and evaluate it using the benchmarks provided in the Taxonomy Extraction Evaluation (TExEval-2) task. The task is to identify hyponym-hypernym relations and to construct a taxonomy from a given domain-specific term list. Our approach is based on a word embedding trained from a large corpus, combined with string-matching approaches. The overall approach is semi-supervised. We propose a generic algorithm that uses the vectors from the embedding effectively to identify hyponym-hypernym relations and to induce the taxonomy. The system generated English-language taxonomies for three different domains (environment, food, and science), which are evaluated against gold-standard taxonomies. The system achieved good results for hyponym-hypernym identification and taxonomy induction, especially when compared to other tools using similar background knowledge.

I. INTRODUCTION

Semantic taxonomies are powerful tools that provide structured knowledge to Natural Language Processing (NLP), Information Retrieval (IR), and general Artificial Intelligence (AI) systems. Semantic taxonomies are extensively used for solving knowledge-rich problems such as textual entailment [1] and question answering [2]. There are a number of manually created and curated taxonomies such as YAGO [3], Freebase [4], SUMO [5], etc. These taxonomies have excellent quality and cover a broad domain. However, manually curating a taxonomy incurs a high cost, as can be seen in taxonomies like WordNet [6] and more encompassing systems like Cyc [7]. Moreover, these systems remain incomplete, and it is sometimes difficult to add new terms because of contradictions. Further, customization for different domains can be cumbersome. Nowadays, partially due to rapid technological development and the availability of data sources, it has become feasible to construct domain-specific taxonomies semi-automatically, or even automatically, instead of creating them manually [8]. The process of taxonomy construction goes by many names, such as automatic taxonomy induction, ontology learning, semantic relation classification, and so on [9]. A number of different theories and methodologies have been proposed to induce taxonomies automatically. Very common are pattern-based approaches. The working principle of these approaches is to define patterns or rules first and then find matching text in a large textual corpus. Often a form of bootstrapping is used: the system iteratively learns more patterns from the corpus by deriving patterns from the results found in the previous round. In this paper we focus on a distributional semantics approach. We utilize the similarity of the contexts in which words occur to infer the meaning of the words. The context of a word is usually defined as the set of words that often occur near it in a large corpus. We are also somewhat inspired by clustering approaches, which incorporate clustering based on word (or context) similarity. In this paper we present a system which creates a taxonomy from a given set of words. This way, we can compare our results to those of the SemEval Taxonomy Extraction Evaluation (TExEval-2) workshop (http://alt.qcri.org/semeval2016/task13/). The presented framework is semi-supervised and combines the strengths of word embeddings, clustering, and string matching to identify hyponym-hypernym relations.

The remainder of the paper is structured as follows. In the next section we provide an overview of related work. Then, in section III we introduce our own system. In section IV we present our evaluation and compare the results with other systems. Then we finish with a short conclusion.
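To make the distributional idea concrete before moving on, the following minimal sketch (not the algorithm from this paper; the toy vectors and function names are invented for illustration) assumes a pre-trained embedding is available as a plain dictionary of vectors and ranks candidate hypernyms of a term by cosine similarity:

```python
import numpy as np

# Toy stand-in for a pre-trained word embedding (term -> vector).
# In practice the vectors come from a model trained on a large corpus.
embedding = {
    "cake":    np.array([0.9, 0.1, 0.3]),
    "dessert": np.array([0.8, 0.2, 0.4]),
    "food":    np.array([0.7, 0.3, 0.5]),
    "science": np.array([0.2, 0.8, 0.3]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_hypernym_candidates(term, candidates):
    """Order candidate hypernyms of `term` by distributional similarity."""
    scored = [(c, cosine(embedding[term], embedding[c]))
              for c in candidates if c in embedding and c != term]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(rank_hypernym_candidates("cake", ["dessert", "food", "science"]))
# "dessert" and "food" rank above "science" for the term "cake".
```

Similarity alone is symmetric and does not say which of the two words is the more general one, which is one reason the presented framework also relies on string matching and a supervised component.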
II. RELATED WORK

The process of automatic taxonomy construction can be split into two steps. First, one needs to obtain the terms which have to be in the taxonomy. These can either be provided or have to be identified and extracted from a text corpus. Second, these terms have to be placed in a hierarchy, forming the taxonomy. Often these two steps are combined into one, as in the pattern-based approaches, which we will review first. After that we will look into clustering, classification, graph, and distributional approaches. Note that these approaches are often combined, as for example by Kozareva and Hovy [10], who made a system based on patterns and graph algorithms for creating the taxonomy.

Pattern-based (also sometimes called rule-based) approaches form perhaps the most common class of automatic taxonomy induction tools. The core idea of these approaches is to establish certain patterns; if these patterns are found in the corpus, the system infers a hypernym-hyponym relation between the matched words. For example, if the pattern a [[noun1]] is a [[noun2]] (where the double-bracketed words are placeholders for actual words in the text) is found, then one can conclude that noun2 is a hypernym of noun1. If these patterns are chosen wisely, such systems achieve high accuracy and often remain relatively simple to implement. Obviously, this kind of pattern can be used for applications beyond hyponym-hypernym detection: other relations like part-of (meronymy), is-like (synonymy), is born in, etc. can be detected using patterns as well.

The creation and selection of patterns can be done either manually [11] or automatically using bootstrapping [12]. The bootstrapping technique starts from a limited set of man-made patterns and extracts words fitting the patterns from the corpus. Then, these extracted words are used to find new rules in the corpus. If we, for example, using the rule above, learned that a cake is a dessert, then we can go through the corpus and detect other sentences containing cake and dessert. If we find the sentence "cake is my favorite dessert", we could instantiate the new rule [[noun1]] is my favorite [[noun2]]. This new rule can then be used to find more pairs of words, and the process gets repeated.

Already in early work, pattern-based approaches were used to find is-a relations in a large corpus. Hearst [12] was the first to utilize a set of hyponym-hypernym patterns and perform bootstrapping to automatically find the is-a relation in a corpus. These patterns are now known as Hearst-style patterns. Others have also used this pattern style; for instance, Mann [13] used Hearst-style patterns to extract the is-a relation for proper nouns.
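As a concrete illustration of the pattern-based idea (this is not code from any of the cited systems; the patterns and sentences are invented), the sketch below matches a few Hearst-style patterns, including the a [[noun1]] is a [[noun2]] rule discussed above, using regular expressions. A real system would run over a large corpus, use part-of-speech information, and wrap this in the bootstrapping loop described above.

```python
import re

# Each entry: (compiled pattern, True if group 1 is the hypernym).
# Real systems use many more patterns plus part-of-speech information.
PATTERNS = [
    (re.compile(r"\ba (\w+) is a (\w+)\b", re.IGNORECASE), False),
    (re.compile(r"\b(\w+) is my favorite (\w+)\b", re.IGNORECASE), False),
    (re.compile(r"\b(\w+)s such as (\w+)\b", re.IGNORECASE), True),
]

def extract_isa_pairs(sentences):
    """Return the set of (hyponym, hypernym) pairs found by the patterns."""
    pairs = set()
    for sentence in sentences:
        for pattern, first_is_hypernym in PATTERNS:
            for match in pattern.finditer(sentence):
                a, b = match.group(1).lower(), match.group(2).lower()
                pairs.add((b, a) if first_is_hypernym else (a, b))
    return pairs

corpus = [
    "Everyone agreed that a cake is a dessert.",
    "Cake is my favorite dessert.",
    "She bakes desserts such as cake and pie.",
]
print(extract_isa_pairs(corpus))  # {('cake', 'dessert')}
```

Pairs extracted this way can then seed the next bootstrapping round, in which sentences containing a known pair suggest new surface patterns.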
However, a drawback of pattern-based approaches is the […]

[…] using morpho-syntactic analysis. She essentially proposed three identifying principles. First, for single-word terms, the suffix is considered a hypernym if the prefix is an existing term; for example, pin is a hypernym of candlepin. Second, for multi-word terms, a term t1 is a hypernym of a multi-word term t2 if t2 contains t1 (as pioneered by Jones). Finally, prepositions in terms are used; for example, the concept sociology of culture contains the preposition of, and hence we would decide that sociology is a hypernym of culture (example from [8]). There are a few limitations associated with string-based methods. The first is that they can only be applied to compound terms (e.g., the hypernym of chemistry cannot be found using a string-based method). The second is that a string-based method cannot differentiate between direct and indirect hypernyms. For instance, animal science is a life science, which in turn is a science; however, a string-based method will identify science as the hypernym of both.

Another large group of approaches is based on clustering. For these approaches, word contexts are represented as vectors containing the values of selected features. Next, a similarity between the vectors is defined, and then the clustering is performed [16], [17]. A number of features have been used to create these vectors, such as noun-verb relations [18], co-occurrence [19], and so on. Using clustering-based approaches, it is possible to identify relations which do not occur explicitly in the text corpus. However, labeling of the clusters is required, which is difficult for taxonomy induction and evaluation. Moreover, similarly to pattern-based approaches, clustering-based approaches fail to produce coherent clusters for small corpora and hence do not produce good results there. To our knowledge, clustering-based approaches have only been used to identify and extract hyponym-hypernym (is-a) and synonym (is-like) relations.

Snow et al. [20] used sentences and parse trees for hypernym identification. First, sentences are extracted; if they contain two terms, parse trees are used to extract patterns. These patterns were subsequently used to train a hypernym classifier to induce a taxonomy from text corpora. Yang and Callan [9] introduced a semi-supervised approach for taxonomy induction […]
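To round off the overview, the three string-based principles reviewed above (suffix, containment, and preposition) are simple enough to sketch directly. The snippet is illustrative only and is not the implementation of any cited system; the vocabulary and function name are invented.

```python
def string_based_relations(term, vocabulary):
    """Propose (hyponym, hypernym) pairs for `term` using the three
    string-based principles described above:
    1. single-word term: its suffix is a hypernym if the prefix is an
       existing term (pin is a hypernym of candlepin);
    2. multi-word term t2: a term t1 contained in t2 is a hypernym of t2
       (science is a hypernym of animal science);
    3. preposition rule for "X of Y": X is taken as a hypernym of Y
       (sociology as a hypernym of culture).
    """
    relations = set()
    words = term.split()
    if len(words) == 1:
        for i in range(1, len(term)):                  # principle 1
            prefix, suffix = term[:i], term[i:]
            if prefix in vocabulary:
                relations.add((term, suffix))
    else:
        for start in range(1, len(words)):             # principle 2
            contained = " ".join(words[start:])
            if contained in vocabulary:
                relations.add((term, contained))
        if "of" in words:                              # principle 3
            i = words.index("of")
            head, rest = " ".join(words[:i]), " ".join(words[i + 1:])
            relations.add((rest, head))
    return relations

vocabulary = {"pin", "candle", "science", "sociology", "culture", "animal science"}
print(string_based_relations("candlepin", vocabulary))        # {('candlepin', 'pin')}
print(string_based_relations("animal science", vocabulary))   # {('animal science', 'science')}
print(string_based_relations("sociology of culture", vocabulary))
# {('sociology of culture', 'culture'), ('culture', 'sociology')}
```

Note how the naive containment check also proposes culture as a hypernym of sociology of culture; imprecisions like this, together with the inability to separate direct from indirect hypernyms, are why string-based evidence is usually combined with other signals.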
Recommended publications
  • Using a Natural Language Understanding System to Generate Semantic Web Content
    Final version to appear, International Journal on Semantic Web and Information Systems, 3(4), 2007. (http://www.igi-pub.com/) Using a Natural Language Understanding System to Generate Semantic Web Content. Akshay Java, Sergei Nirenburg, Marjorie McShane, Timothy Finin, Jesse English, Anupam Joshi, University of Maryland, Baltimore County, Baltimore MD 21250, {aks1,sergei,marge,finin,english1,joshi}@umbc.edu Abstract: We describe our research on automatically generating rich semantic annotations of text and making it available on the Semantic Web. In particular, we discuss the challenges involved in adapting the OntoSem natural language processing system for this purpose. OntoSem, an implementation of the theory of ontological semantics under continuous development for over fifteen years, uses a specially constructed NLP-oriented ontology and an ontological-semantic lexicon to translate English text into a custom ontology-motivated knowledge representation language, the language of text meaning representations (TMRs). OntoSem concentrates on a variety of ambiguity resolution tasks as well as processing unexpected input and reference. To adapt OntoSem’s representation to the Semantic Web, we developed a translation system, OntoSem2OWL, from the TMR language into the Semantic Web language OWL. We next used OntoSem and OntoSem2OWL to support SemNews, an experimental web service that monitors RSS news sources, processes the summaries of the news stories and publishes a structured representation of the meaning of the text in the news story. Index Terms: semantic web, OWL, RDF, natural language processing, information extraction. I. INTRODUCTION A core goal of the development of the Semantic Web is to bring progressively more meaning to the information published on the Web.
  • Towards a Taxonomy of KOS: Dimensions for Classifying Knowledge Organization Systems I
    Renato Rocha Souza (Getulio Vargas Foundation, Brazil) Douglas Tudhope (University of Glamorgan, UK) Maurício Barcellos Almeida (Federal University of Minas Gerais, Brazil) Towards a taxonomy of KOS: Dimensions for classifying Knowledge Organization Systems i Abstract: This paper analyzes previous work on the classification of Knowledge Organization Systems (KOS), discusses strengths and weaknesses, and proposes a new and integrative framework. It argues that current analyses of the KOS tend to be idiosyncratic and incomplete, relying on a limited number of dimensions of analysis. The paper discusses why and how KOS should be classified on a new basis. Based on the available literature and previous work, the authors propose a wider set of dimensions for the analysis of KOS. These are represented in a taxonomy of KOS. Issues arising are discussed. 1. Representation and Knowledge Organization Systems 1 Although central to many scientific fields, or perhaps because of that, the process of representing knowledge is not a simple matter. There are many distinct theories, models, methodologies and products; all influenced by specific applications, backgrounds and purposes. Knowledge representation artifacts are produced in related fields, such as Artificial Intelligence, Semiotics, Computer Science and Cognitive Science; but are also widely used in a myriad of less related areas, such as Education, Mathematics, Business Modeling, Linguistics, and many more. The field of Library and Information Science (LIS), however, considers these representational artifacts as a paramount matter. In fact, LIS takes upon itself the task of organizing and facilitating the retrieval of the wealth of information that arises from the knowledge produced in all other fields, and this involves the creation of epistemological and ontological surrogates.
  • Towards a Semantic Web for Culture
    Towards a Semantic Web for Culture Kim H. Veltman Maastricht McLuhan Institute PO Box 616 Maastricht MD 6200, Netherlands [email protected] Contents 1. Introduction 1.1. World Wide Web 1.2. Sowa’s Knowledge Representation 1.3. Five Issues 1.4. Greek Principles 1.5. Causes, Substance, and Accidents 2. Substance to Function 2.1 Ontology to Systematics 3. Definitions 3.1. Semiotics and Linguistics 4. Words versus Concepts 5. Relations 5.1. Formal Relationships 5.2. Form-Categorical Relationships 5.3. Material Concept Relationships 5.3.1. Generic or Abstraction 5.3.2. Partition 5.3.3. Opposition or Complementary 5.3.4. Functional Relationship or Syntax 6. Relations, Universals and Particulars 7. Dynamic Meaning 7.1. Cultural and Historical Dimensions of Knowledge 7.2. Recent Developments 8. Semantics 8.1. Relational Databases 8.2. Semantic Web and Semantic Networks 9. Practical Consequences 10. Conclusions Appendices 1. Changing World Views 2. From Inanimate Being to Mineralogy and Chemistry 3. Mineral Classification Systems 4. From Animate Being to Taxonomy 5. Perreault’s Classifications of Relations 1 Abstract Today’s semantic web deals with meaning in a very restricted sense and offers static solutions. This is adequate for many scientific, technical purposes and for business transactions requiring machine-to-machine communication, but does not answer the needs of culture. Science, technology and business are concerned primarily with the latest findings, the state of the art, i.e. the paradigm or dominant world-view of the day. In this context, history is considered non-essential because it deals with things that are out of date.
  • Text Understanding Agents and the Semantic Web
    Text understanding agents and the Semantic Web. Akshay Java, Tim Finin and Sergei Nirenburg, University of Maryland, Baltimore County, Baltimore MD 21250, faks1,finin,[email protected] Abstract— We discuss the challenges involved in adapting the OntoSem natural language processing system to the Web. One set of tasks involves processing Web documents, translating their computed meaning representations from OntoSem’s native KR language into the Semantic Web language OWL, and publishing the results as Web pages and RSS feeds. Another set of tasks works in reverse – querying the Web for facts needed by OntoSem, translating them from OWL into OntoSem’s native KR language and importing the results. A central problem underlying both sets of tasks is that of translating knowledge between OntoSem’s KR language and ontologies and those of the Semantic Web. OntoSem2OWL has been developed as a translation system to support these translations. We describe SemNews, an implemented prototype application that demonstrates the process. […] is as documents and annotations encoded in Semantic Web languages such as RDF and OWL. This will make the vast amount of information found in text documents on the Web more readily and efficiently available to a large community of software agents. At the same time, language understanding agents can directly use information found on the web encoded in OWL to help guide their language understanding tasks. NLP systems require not only ontological knowledge (e.g., “A city is a geopolitical region”) and lexical knowledge (e.g., “city denotes a city”) but also a considerable body of facts. Such information, often called encyclopedic knowledge, includes […]
  • Ontology Matching
    Open Information Systems 2019-2020, Lecture 9: Ontology Matching, Christophe Debruyne. While ontologies and KBs are different, we will – in this lecture – use the term “ontology” for both unless otherwise explicitly specified. Semantic Heterogeneity: Semantic heterogeneity refers to variations in meaning or ambiguity in interpretations [1], which remains a challenge in database research (and other CS disciplines). Semantic heterogeneity can be caused by: • Ontologies sharing concepts, yet built for different tasks or by different communities, resulting in variation in scope, structure, and granularity. Example from the GLAM sector: in Dublin Core, names of people are literals; in CIDOC CRM, names are instances of Appellation identifying instances of Person, and relationships between appellations can be declared. • The use of different ontology (or representation) languages: OWL vs. SQL • Using different terminologies – within the same language (e.g., jargon and synonyms) and – across languages (also note language variation). • Ontologies evolving over time (versioning) • … The Ontology Matching Problem: Semantic heterogeneity is typically tackled in two steps: 1. detecting the correspondences between the different ontologies, which constitute an alignment, and 2. interpreting these correspondences to create an executable mapping with respect to the application needs. Why Ontology Matching (OM)? OM techniques are useful for: • Schema & data integration [integrating data w/ common schema] • Data interlinking [linking instances] • Data transformation [from one representation to another] • Query answering [via a common ontology] • … The Ontology Matching Problem: The matching problem is not new; it has been studied in the context of data warehousing and data integration.
  • Content Ontology Design Patterns Qualities, Methods, and Tools
    Linköping Studies in Science and Technology, Dissertations, No. 1879. Content Ontology Design Patterns: Qualities, Methods, and Tools, by Karl Hammar. Linköping University, Department of Computer and Information Science, Division of Human-Centered Systems, SE-581 83 Linköping, Sweden, Linköping 2017. © 2017 Karl Hammar. Cover photograph by Jenny Hammar. ISBN: 978-91-7685-454-9, ISSN: 0345-7524. URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139584/ Printed by LiU Tryck, Linköping 2017. Abstract: Ontologies are formal knowledge models that describe concepts and relationships and enable data integration, information search, and reasoning. Ontology Design Patterns (ODPs) are reusable solutions intended to simplify ontology development and support the use of semantic technologies by ontology engineers. ODPs document and package good modelling practices for reuse, ideally enabling inexperienced ontologists to construct high-quality ontologies. Although ODPs are already used for development, there are still remaining challenges that have not been addressed in the literature. These research gaps include a lack of knowledge about (1) which ODP features are important for ontology engineering, (2) less experienced developers’ preferences and barriers for employing ODP tooling, and (3) the suitability of the eXtreme Design (XD) ODP usage methodology in non-academic contexts. This dissertation aims to close these gaps by combining quantitative and qualitative methods, primarily based on five ontology engineering projects involving inexperienced ontologists. A series of ontology engineering workshops and surveys provided data about developer preferences regarding ODP features, ODP usage methodology, and ODP tooling needs.
  • Using a Natural Language Understanding System to Generate Semantic Web Content
    Int’l Journal on Semantic Web & Information Systems, 3(4), 50-74, October-December 2007. Using a Natural Language Understanding System to Generate Semantic Web Content. Akshay Java, University of Maryland, USA; Sergei Nirenburg, University of Maryland, USA; Marjorie McShane, University of Maryland, USA; Timothy Finin, University of Maryland, USA; Jesse English, University of Maryland, USA; Anupam Joshi, University of Maryland, USA. ABSTRACT We describe our research on automatically generating rich semantic annotations of text and making it available on the Semantic Web. In particular, we discuss the challenges involved in adapting the OntoSem natural language processing system for this purpose. OntoSem, an implementation of the theory of ontological semantics under continuous development for over 15 years, uses a specially constructed NLP-oriented ontology and an ontological-semantic lexicon to translate English text into a custom ontology-motivated knowledge representation language, the language of text meaning representations (TMRs). OntoSem concentrates on a variety of ambiguity resolution tasks as well as processing unexpected input and reference. To adapt OntoSem’s representation to the Semantic Web, we developed a translation system, OntoSem2OWL, from the TMR language into the Semantic Web language OWL. We next used OntoSem and OntoSem2OWL to support SemNews, an experimental Web service that monitors RSS news sources, processes the summaries of the news stories, and publishes a structured representation of the meaning of the text in the news story. Keywords: information extraction; natural language processing; OWL; RDF; Semantic Web. INTRODUCTION A core goal of the development of the Semantic Web is to bring progressively more meaning to the information published on the Web. […] annotation is time-consuming and error-prone. Moreover, annotations must be made in a formal language whose use may require considerable […]
  • Mereology, Conceptual Modelling and Mathematical Aspects
    KRDB RESEARCH CENTRE KNOWLEDGE REPRESENTATION MEETS DATABASES Faculty of Computer Science, Free University of Bozen-Bolzano, Piazza Domenicani 3, 39100 Bolzano, Italy Tel: +39 04710 16000, fax: +39 04710 16009, http://www.inf.unibz.it/krdb/ KRDB Research Centre Technical Report: Introduction to part-whole relations: mereology, conceptual modelling and mathematical aspects. C. Maria Keet, Affiliation: KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano, Piazza Domenicani 3, 39100 Bolzano, Italy. Corresponding author: Maria Keet, [email protected] Keywords: Mereology, Aggregation, Part-whole relation, DL, ORM, UML. Number: KRDB06-3. Date: 2 October 2006. URL: http://www.inf.unibz.it/krdb/pub/ © KRDB Research Centre. This work may not be copied or reproduced in whole or part for any commercial purpose. Permission to copy in whole or part without payment of fee is granted for non-profit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of the KRDB Research Centre, Free University of Bozen-Bolzano, Italy; an acknowledgement of the authors and individual contributors to the work; all applicable portions of this copyright notice. Copying, reproducing, or republishing for any other purpose shall require a licence with payment of fee to the KRDB Research Centre. Introduction to part-whole relations: mereology, conceptual modelling and mathematical aspects. Maria Keet, KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano, Italy, [email protected] 1 Introduction An active area of research in knowledge representation concerns the nature of the part-whole relation between entities or instances, its representation in a (formal) language, and its translation to support in software systems.
  • Semantic Relatedness Measures for Identifying Relationships in Product Development Processes
    Center for e-Design Proceedings, Center for e-Design, 8-30-2009. Semantic Relatedness Measures for Identifying Relationships in Product Development Processes. Paul Witherell, University of Massachusetts Amherst, [email protected]; Sundar Krishnamurty, University of Massachusetts Amherst, [email protected]; Ian R. Grosse, University of Massachusetts Amherst, [email protected]; Jack C. Wileden, University of Massachusetts Amherst, [email protected]. Follow this and additional works at: http://lib.dr.iastate.edu/edesign_conf Part of the Industrial Engineering Commons, and the Programming Languages and Compilers Commons. Recommended Citation: Witherell, Paul; Krishnamurty, Sundar; Grosse, Ian R.; and Wileden, Jack C., "Semantic Relatedness Measures for Identifying Relationships in Product Development Processes" (2009). Center for e-Design Proceedings. 6. http://lib.dr.iastate.edu/edesign_conf/6 This Conference Proceeding is brought to you for free and open access by the Center for e-Design at Iowa State University Digital Repository. It has been accepted for inclusion in Center for e-Design Proceedings by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Abstract: The Semantic Web, especially in relation to ontologies, provides a structured, formal framework for knowledge interoperability. This trait has been exploited by both the biomedical community in development of the Human Gene Ontology [1] and also by geographers in development of geospatial ontologies [2]. Using semantic relatedness techniques, researchers from both communities have been able to develop and integrate comprehensive knowledge bases. Beyond knowledge integration, semantic relatedness techniques have also been able to provide each community with a unique insight into relationships between concepts in their respective domains.
  • Towards an Ontology Design Pattern Quality Model
    Linköping Studies in Science and Technology, Licentiate Thesis No. 1606. Towards an Ontology Design Pattern Quality Model, by Karl Hammar. Department of Computer and Information Science, Linköping University, SE-581 83 Linköping, Sweden, Linköping 2013. This is a Swedish Licentiate’s Thesis. Swedish postgraduate education leads to a doctor’s degree and/or a licentiate’s degree. A doctor’s degree comprises 240 ECTS credits (4 years of full-time studies). A licentiate’s degree comprises 120 ECTS credits. Copyright © 2013 Karl Hammar. ISBN 978-91-7519-570-4, ISSN 0280-7971. Printed by LiU Tryck 2013. URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93370 Towards an Ontology Design Pattern Quality Model by Karl Hammar, September 2013. ISBN 978-91-7519-570-4, Linköping Studies in Science and Technology, Licentiate Thesis No. 1606, ISSN 0280-7971, LiU-Tek-Lic-2013:40. ABSTRACT: The use of semantic technologies and Semantic Web ontologies in particular has enabled many recent developments in information integration, search engines, and reasoning over formalised knowledge. Ontology Design Patterns have been proposed to be useful in simplifying the development of Semantic Web ontologies by codifying and reusing modelling best practices. This thesis investigates the quality of Ontology Design Patterns. The main contribution of the thesis is a theoretically grounded and partially empirically evaluated quality model for such patterns including a set of quality characteristics, indicators, measurement methods and recommendations. The quality model is based on established theory on information system quality, conceptual model quality, and ontology evaluation.
  • The Headed Meronomy As a Ladder to Complex Thought
    The headed meronomy as a ladder to complex thought. Chris Thornton, Centre for Research in Cognitive Science, University of Sussex, Brighton BN1 9QJ, UK, [email protected] August 14, 2017. Abstract: How changing patterns of human behavior reveal (or stem from) cognitive changes has long been of interest. A recent archaeological project has attempted to understand evidence taken from the neolithic site of Çatalhöyük in relation to the emergence of consciousness and creativity at the dawn of settled life. In practice, our understanding of consciousness and creativity defined broadly is quite limited. We do, however, have a relatively precise understanding of what conceptual creativity (the creative construction of new ideas) must involve. There is the potential to bring this more specific understanding of mental processes to bear. This article reviews some relevant principles of conceptual creativity, and considers how they might be used in decoding the evidence from Çatalhöyük. Keywords: conceptual creativity, ontology, meronomy. 1 Introduction There is a natural desire to explain the development of cognitive phenomena that seem particularly characteristic of the human mind. Creativity is a case in point. Consciousness is another. A recent archaeological project has attempted to understand evidence taken from the neolithic site of Çatalhöyük in relation to the emergence of both at the dawn of settled life. This project faces the need to settle on a precise understanding of what is meant by the terms in question. But theorists differ as to what this understanding should be. And there are deeper disputes relating to cognition in general.
  • Automatic Generation of Concept Taxonomies from Web Search Data Using Support Vector Machine
    AUTOMATIC GENERATION OF CONCEPT TAXONOMIES FROM WEB SEARCH DATA USING SUPPORT VECTOR MACHINE. Robertas Damaševičius, Software Engineering Department, Kaunas University of Technology, Studentų 50, LT-51368, Kaunas, Lithuania. Keywords: Taxonomy learning, Web mining, Data mining, Machine learning, Support Vector Machine. Abstract: Ontologies and concept taxonomies are essential parts of the Semantic Web infrastructure. Since manual construction of taxonomies requires considerable efforts, automated methods for taxonomy construction should be considered. In this paper, an approach for automatic derivation of concept taxonomies from web search results is presented. The method is based on generating derivative features from web search data and applying the machine learning techniques. The Support Vector Machine (SVM) classifier is trained with known concept hyponym-hypernym pairs and the obtained classification model is used to predict new hyponymy (is-a) relations. Prediction results are used to generate concept taxonomies in OWL. The results of the application of the approach for constructing colour taxonomy are presented. 1 INTRODUCTION The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, which makes it easier for machines to automatically process, interpret and integrate […] classes, their instances and attributes. The central components of ontologies are taxonomies, which define only taxonomical relationships between concepts. In fact, many ontology development methodologies such as METHONTOLOGY […]