Natural Language Processing Week 03 Contents

Total Page:16

File Type:pdf, Size:1020Kb

Contents

1 Information extraction
    1.1 History
    1.2 Present significance
    1.3 Tasks and subtasks
    1.4 World Wide Web applications
    1.5 Approaches
    1.6 Free or open source software and services
    1.7 Commercial software and services
    1.8 See also
    1.9 References
    1.10 External links
2 Named-entity recognition
    2.1 Problem definition
        2.1.1 Formal evaluation
    2.2 Approaches
    2.3 Problem domains
    2.4 Current challenges and research
    2.5 Software
    2.6 See also
    2.7 References
    2.8 External links
3 Part-of-speech tagging
    3.1 Principle
    3.2 History
        3.2.1 The Brown Corpus
        3.2.2 Use of hidden Markov models
        3.2.3 Dynamic programming methods
        3.2.4 Unsupervised taggers
        3.2.5 Other taggers and methods
    3.3 Issues
    3.4 See also
    3.5 References
    3.6 External links
4 Phrase chunking
    4.1 See also
    4.2 External links
5 Relationship extraction
    5.1 Applications
    5.2 Approaches
    5.3 See also
    5.4 References
6 Sentence boundary disambiguation
    6.1 Strategies
    6.2 Software
    6.3 See also
    6.4 References
    6.5 External links
7 Shallow parsing
    7.1 References
    7.2 External links
    7.3 See also
8 Stemming
    8.1 Examples
    8.2 History
    8.3 Algorithms
        8.3.1 The production technique
        8.3.2 Suffix-stripping algorithms
        8.3.3 Lemmatisation algorithms
        8.3.4 Stochastic algorithms
        8.3.5 n-gram analysis
        8.3.6 Hybrid approaches
        8.3.7 Affix stemmers
        8.3.8 Matching algorithms
    8.4 Language challenges
        8.4.1 Multilingual stemming
    8.5 Error metrics
    8.6 Applications
        8.6.1 Information retrieval
        8.6.2 Domain Analysis
        8.6.3 Use in commercial products
    8.7 See also
    8.8 References
    8.9 Further reading
    8.10 External links
9 Text segmentation
    9.1 Segmentation problems
        9.1.1 Word segmentation
        9.1.2 Sentence segmentation
        9.1.3 Topic segmentation
        9.1.4 Other segmentation problems
    9.2 Automatic segmentation approaches
    9.3 See also
    9.4 References
    9.5 External links
10 Tokenization (lexical analysis)
    10.1 Methods and obstacles
    10.2 Software
    10.3 See also
    10.4 References
11 Parsing
    11.1 Human languages
        11.1.1 Traditional methods
        11.1.2 Computational methods
        11.1.3 Psycholinguistics
    11.2 Computer languages
        11.2.1 Parser
        11.2.2 Overview of process
    11.3 Types of parsers
    11.4 Parser development software
    11.5 Lookahead
    11.6 See also
    11.7 References
    11.8 Further reading
    11.9 External links
12 Parse tree
    12.1 Constituency-based parse trees
    12.2 Dependency-based parse trees
    12.3 Phrase markers
    12.4 See also
    12.5 Notes
    12.6 References
    12.7 External links
13 Constituent (linguistics)
    13.1 Constituency tests
        13.1.1 Topicalization (fronting)
        13.1.2 Clefting
        13.1.3 Pseudoclefting
        13.1.4 Pro-form substitution (replacement)
        13.1.5 Answer ellipsis (answer fragments, question test)
        13.1.6 Passivization
        13.1.7 Omission (deletion)
        13.1.8 Coordination
    13.2 Constituency tests and disambiguation
    13.3 Competing theories
    13.4 See also
    13.5 Notes
    13.6 References
14 Dependency grammar
    14.1 History
    14.2 Dependency vs. constituency
    14.3 Dependency grammars
    14.4 Representing dependencies
    14.5 Types of dependencies
        14.5.1 Semantic dependencies
        14.5.2 Morphological dependencies
        14.5.3 Prosodic dependencies
        14.5.4 Syntactic dependencies
    14.6 Linear order and discontinuities
    14.7 Syntactic functions
    14.8 See also
    14.9 Notes
    14.10 References
    14.11 External links
15 Phrase structure grammar
    15.1 Constituency relation
    15.2 Dependency relation
    15.3 Non-descript grammars
    15.4 See also
    15.5 Notes

Recommended publications
  • A Protein Interaction Extraction System using a Link Grammar Parser from Biomedical Abstracts
    World Academy of Science, Engineering and Technology, International Journal of Biomedical and Biological Engineering, Vol:1, No:5, 2007
    PIELG: A Protein Interaction Extraction System using a Link Grammar Parser from Biomedical Abstracts
    Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, and Yasser M. Kadah, Senior Member, IEEE
    Abstract: Due to the ever growing amount of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses linkage given by the Link Grammar Parser to start a case based analysis of contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verbs patterns to overcome preposition combinations problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental evaluations with two other state-of-the-art extraction systems indicate that PIELG system achieves better performance.
    … failure. Applications that repair or replace portions of or whole living tissues (e.g., bone, dentine, or bladder) using living cells is named Tissue Engineering (TE). For example, dentine formation is the process of regenerating dental tissues by tissue engineering principles and technology. Dentine formation is governed by biological mediators or growth factors (protein) and interactions amongst different proteins. Dentine formation needs the support of continuous updated information about protein-protein interactions. Researches in the last decade have resulted in the production of a large amount of information about protein functions involved in dentine formation process. That …
  • Enhanced Thesaurus Terms Extraction for Document Indexing
    Enhanced Thesaurus Terms Extraction for Document Indexing
    Frane Šarić, Jan Šnajder, Bojana Dalbelo Bašić, Hrvoje Eklić
    Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000 Zagreb, Croatia
    E-mail: {Frane.Saric, Jan.Snajder, Bojana.Dalbelo, Hrvoje.Eklic}@fer.hr
    Abstract. In this paper we present an enhanced method for the thesaurus term extraction regarded as the main support to a semi-automatic indexing system. The enhancement is achieved by neutralising the effect of language morphology applying lemmatisation on both the text and the thesaurus, and by implementing an efficient recursive algorithm for term extraction. Formal definition and statistical evaluation of the experimental results of the proposed method for thesaurus term extraction are given. The need for disambiguation methods and the effect of lemmatisation in the realm of thesaurus term extraction are discussed.
    Keywords. Information retrieval, term extraction, NLP, lemmatisation, Eurovoc.
    … mogeneous due to diverse background knowledge and expertise of human indexers. The task of building semi-automatic and automatic systems, which aim to decrease the burden of work borne by indexers, has recently attracted interest in the research community [4], [13], [14]. Automatic indexing systems still do not achieve the performance of human indexers, so semi-automatic systems are widely used (CINDEX, MACREX, MAI [10]).
    In this paper we present a method for thesaurus term extraction regarded as the main support to semi-automatic indexing system. Term extraction is a process of finding all verbatim occurrences of all terms in the text. Our method of term extraction is a part of …
  • Implementing a Portable Clinical NLP System with a Common Data Model: a Lisp Perspective
    Implementing a portable clinical NLP system with a common data model: a Lisp perspective
    Citation: Luo, Yuan, and Peter Szolovits, "Implementing a portable clinical NLP system with a common data model: a Lisp perspective." Proceedings, 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2018), December 3-6, 2018, Madrid, Spain (Piscataway, N.J.: IEEE, 2018): doi 10.1109/BIBM.2018.8621521 ©2018 Author(s)
    As published: 10.1109/BIBM.2018.8621521
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Version: Original manuscript
    Citable link: https://hdl.handle.net/1721.1/124439
    Terms of use: Creative Commons Attribution-Noncommercial-Share Alike
    Detailed terms: http://creativecommons.org/licenses/by-nc-sa/4.0/
    Authors: Yuan Luo* (Dept. of Preventive Medicine, Northwestern University, Chicago, USA) and Peter Szolovits* (CSAIL, MIT, Cambridge, USA)
    Abstract: This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized and in-house developed NLP tools. Our system facilitates portability across different institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize necessary data elements.
    … annotations, often in idiosyncratic representations. This makes it quite difficult to chain together sequences of operations. Although several recent projects have achieved reasonable success in analyzing certain types of clinical narratives [3-6], efforts towards a common data model (CDM) to ensure port…
  • ECE351 Lab Manual
    Derek Rayside & ECE351 Staff, ECE351 Lab Manual, University of Waterloo
    Copyright © 2014 Derek Rayside & ECE351 Staff. Compiled March 6, 2014.
    Acknowledgements:
      • Prof Paul Ward suggested that we look into something with VHDL to have synergy with ECE327.
      • Prof Mark Aagaard, as the ECE327 instructor, consulted throughout the development of this material.
      • Prof Patrick Lam generously shared his material from the last offering of ECE251.
      • Zhengfang (Alex) Duanmu & Lingyun (Luke) Li [1b Elec] wrote solutions to most labs in TXL.
      • Jiantong (David) Gao & Rui (Ray) Kong [3b Comp] wrote solutions to the VHDL labs in ANTLR.
      • Aman Muthrej and Atulan Zaman [3a Comp] wrote solutions to the VHDL labs in Parboiled.
      • TAs Jon Eyolfson, Vajih Montaghami, Alireza Mortezaei, Wenzhu Man, and Mohammed Hassan.
      • TA Wallace Wu developed the VHDL labs.
      • High school students Brian Engio and Tianyu Guo drew a number of diagrams for this manual, wrote Javadoc comments for the code, and provided helpful comments on the manual.
    Licensed under Creative Commons Attribution-ShareAlike (CC BY-SA) version 2.5 or greater.
    http://creativecommons.org/licenses/by-sa/2.5/ca/
    http://creativecommons.org/licenses/by-sa/3.0/
    Contents
    0 Overview
        Compiler Concepts: call stack, heap
        Programming Concepts: version control, push, pull, merge, SSH keys, IDE, debugger, objects, pointers
        0.1 How the Labs Fit Together
        0.2 Learning Progressions
        0.3 How this project compares to CS241, the text book, etc.
        0.4 Student work load
        0.5 How this course compares to MIT 6.035
        0.6 Where do I learn more?
  • NLP Commercialisation in the Last 25 Years
    Natural Language Engineering (2019), 25, pp. 419–426, doi:10.1017/S1351324919000135
    Anniversary INDUSTRY WATCH
    NLP commercialisation in the last 25 years
    Robert Dale∗, Language Technology Group (∗Corresponding author)
    Abstract: The Journal of Natural Language Engineering is now in its 25th year. The editorial preface to the first issue emphasised that the focus of the journal was to be on the practical application of natural language processing (NLP) technologies: the time was ripe for a serious publication that helped encourage research ideas to find their way into real products. The commercialisation of NLP technologies had already started by that point, but things have advanced tremendously over the last quarter-century. So, to celebrate the journal’s anniversary, we look at how commercial NLP products have developed over the last 25 years.
    1. Some context
    For many researchers, work in natural language processing (NLP) has a dual appeal. On the one hand, the computational modelling of language understanding or language production has often been seen as means of exploring theoretical questions in both linguistics and psycholinguistics; the general argument being that, if you can build a computational model of some phenomenon, then you have likely moved some way towards an understanding of that phenomenon. On the other hand, the scope for practical applications of NLP technologies has always been enticing: the idea that we could build truly useful computational artifacts that work with human language goes right back to the origins of the field in the early machine translation experiments of the 1950s.
    However, it was in the early 1990s that commercial applications of NLP really started to flourish, pushed forward in particular by targeted research in both the USA, much of it funded by the Defense Advanced Research Projects Agency (DARPA) via programs like the Message Understanding Conferences (MUC), and Europe, via a number of large-scale forward-looking EU-funded research programs.
  • A Way with Words: Recent Advances in Lexical Theory and Analysis: a Festschrift for Patrick Hanks
    A Way with Words: Recent Advances in Lexical Theory and Analysis: A Festschrift for Patrick Hanks
    Gilles-Maurice de Schryver (editor) (Ghent University and University of the Western Cape)
    Kampala: Menha Publishers, 2010, vii+375 pp; ISBN 978-9970-10-101-6, €59.95
    Reviewed by Paul Cook, University of Toronto
    In his introduction to this collection of articles dedicated to Patrick Hanks, de Schryver presents a quote from Atkins referring to Hanks as “the ideal lexicographer’s lexicographer.” Indeed, Hanks has had a formidable career in lexicography, including playing an editorial role in the production of four major English dictionaries. But Hanks’s achievements reach far beyond lexicography; in particular, Hanks has made numerous contributions to computational linguistics. Hanks is co-author of the tremendously influential paper “Word association norms, mutual information, and lexicography” (Church and Hanks 1989) and maintains close ties to our field. Furthermore, Hanks has advanced the understanding of the relationship between words and their meanings in text through his theory of norms and exploitations (Hanks forthcoming).
    The range of Hanks’s interests is reflected in the authors and topics of the articles in this Festschrift; this review concentrates more on those articles that are likely to be of most interest to computational linguists, and does not discuss some articles that appeal primarily to a lexicographical audience. Following the introduction to Hanks and his career by de Schryver, the collection is split into three parts: Theoretical Aspects and Background, Computing Lexical Relations, and Lexical Analysis and Dictionary Writing.
    1. Part I: Theoretical Aspects and Background
    Part I begins with an unfinished article by the late John Sinclair, in which he begins to put forward an argument that multi-word units of meaning should be given the same status in dictionaries as ordinary headwords.
  • Validating LR(1) Parsers
    Validating LR(1) Parsers
    Jacques-Henri Jourdan (1, 2), François Pottier (2), and Xavier Leroy (2)
    (1) École Normale Supérieure, (2) INRIA Paris-Rocquencourt
    Abstract. An LR(1) parser is a finite-state automaton, equipped with a stack, which uses a combination of its current state and one lookahead symbol in order to determine which action to perform next. We present a validator which, when applied to a context-free grammar G and an automaton A, checks that A and G agree. Validating the parser provides the correctness guarantees required by verified compilers and other high-assurance software that involves parsing. The validation process is independent of which technique was used to construct A. The validator is implemented and proved correct using the Coq proof assistant. As an application, we build a formally-verified parser for the C99 language.
    1 Introduction
    Parsing remains an essential component of compilers and other programs that input textual representations of structured data. Its theoretical foundations are well understood today, and mature technology, ranging from parser combinator libraries to sophisticated parser generators, is readily available to help implementing parsers. The issue we focus on in this paper is that of parser correctness: how to obtain formal evidence that a parser is correct with respect to its specification? Here, following established practice, we choose to specify parsers via context-free grammars enriched with semantic actions. One application area where the parser correctness issue naturally arises is formally-verified compilers such as the CompCert verified C compiler [14]. Indeed, in the current state of CompCert, the passes that have been formally verified start at abstract syntax trees (AST) for the CompCert C subset of C and extend to ASTs for three assembly languages.
  • Finetuning Pre-Trained Language Models for Sentiment Classification of COVID19 Tweets
    Technological University Dublin, ARROW@TU Dublin
    Dissertations, School of Computer Sciences, 2020
    Finetuning Pre-Trained Language Models for Sentiment Classification of COVID19 Tweets
    Arjun Dussa, Technological University Dublin
    Collection: https://arrow.tudublin.ie/scschcomdis (part of the Computer Engineering Commons)
    Recommended citation: Dussa, A. (2020) Finetuning Pre-trained language models for sentiment classification of COVID19 tweets, Dissertation, Technological University Dublin. doi:10.21427/fhx8-vk25
    This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.
    A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computer Science (Data Analytics), September 2020
    Declaration
    I certify that this dissertation which I now submit for examination for the award of MSc in Computing (Data Analytics), is entirely my own work and has not been taken from the work of others save and to the extent that such work has been cited and acknowledged within the text of my work. This dissertation was prepared according to the regulations for postgraduate study of the Technological University Dublin and has not been submitted in whole or part for an award in any other Institute or University.
  • An Evaluation of Machine Learning Approaches to Natural Language Processing for Legal Text Classification
    Imperial College London, Department of Computing
    An Evaluation of Machine Learning Approaches to Natural Language Processing for Legal Text Classification
    Author: Clavance Lim
    Supervisors: Prof Alessandra Russo, Nuri Cingillioglu
    Submitted in partial fulfillment of the requirements for the MSc degree in Computing Science of Imperial College London, September 2019
    Contents
    Abstract
    Acknowledgements
    1 Introduction
        1.1 Motivation
        1.2 Aims and objectives
        1.3 Outline
    2 Background
        2.1 Overview
            2.1.1 Text classification
            2.1.2 Training, validation and test sets
            2.1.3 Cross validation
            2.1.4 Hyperparameter optimization
            2.1.5 Evaluation metrics
        2.2 Text classification pipeline
        2.3 Feature extraction
            2.3.1 Count vectorizer
            2.3.2 TF-IDF vectorizer
            2.3.3 Word embeddings
        2.4 Classifiers
            2.4.1 Naive Bayes classifier
            2.4.2 Decision tree
            2.4.3 Random forest
            2.4.4 Logistic regression
            2.4.5 Support vector machines
            2.4.6 k-Nearest Neighbours
  • Implementation of Processing in Racket
    P2R: Implementation of Processing in Racket
    Hugo Correia, Instituto Superior Técnico, University of Lisbon
    Abstract. Programming languages are being introduced in several areas of expertise, including design and architecture. Processing is an example of one of these languages that was created to teach architects and designers how to program. In spite of offering a wide set of features, Processing does not support the use of traditional computer-aided design applications, which are heavily used in the architecture industry.
    Rosetta is a generative design tool based on the Racket language that attempts to solve this problem. Rosetta provides a novel approach to design creation, offering a set of programming languages that generate designs in different computer-aided design applications. However, Rosetta does not support Processing. Therefore, the goal is to add Processing to Rosetta's language set, offering architects that know Processing, an alternative solution that supports computer-aided design applications.
    In order to achieve this goal, a source-to-source compiler that translates Processing into Racket will be developed. This will also give the Racket community the ability to use Processing in the Racket environment, and, at the same time, will allow the Processing community to take advantage of Racket's libraries and development environment.
    In this report, an analysis of different language implementation mechanisms will be presented, focusing on the different steps of the compilation phase, as well as higher-level solutions, including Language Workbenches. In order to gain insight of source-to-source compiler creation, relevant existing source-to-source compilers are presented.
  • IntelliBot: a Domain-Specific Chatbot for the Insurance Industry
    IntelliBot: A Domain-specific Chatbot for the Insurance Industry
    Mohammad Nuruzzaman
    A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy, UNSW Canberra at Australia Defence Force Academy (ADFA), School of Business, 20 October 2020
    Originality statement
    ‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institute, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.’
    Signed / Date
    To my beloved parents
    Acknowledgement
    Writing a thesis is a great process to review not only my academic work but also the journey I took as a PhD student. I have spent four lovely years at UNSW Canberra in the Australian Defence Force Academy (ADFA). Throughout my journey in graduate school, I have been fortunate to come across so many brilliant researchers and genuine friends. It is the people who I met shaped who I am today. This thesis would not have been possible without them. My gratitude goes out to all of them.
  • Formal Concept Analysis in Knowledge Processing: a Survey on Applications
    Expert Systems with Applications 40 (2013) 6538–6560
    Journal homepage: www.elsevier.com/locate/eswa
    Review
    Formal concept analysis in knowledge processing: A survey on applications
    Jonas Poelmans (a, c, corresponding author), Dmitry I. Ignatov (c), Sergei O. Kuznetsov (c), Guido Dedene (a, b)
    (a) KU Leuven, Faculty of Business and Economics, Naamsestraat 69, 3000 Leuven, Belgium
    (b) Universiteit van Amsterdam Business School, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands
    (c) National Research University Higher School of Economics, Pokrovsky boulevard, 11, 109028 Moscow, Russia
    Keywords: Formal concept analysis (FCA); Knowledge discovery in databases; Text mining; Applications; Systematic literature overview
    Abstract: This is the second part of a large survey paper in which we analyze recent literature on Formal Concept Analysis (FCA) and some closely related disciplines using FCA. We collected 1072 papers published between 2003 and 2011 mentioning terms related to Formal Concept Analysis in the title, abstract and keywords. We developed a knowledge browsing environment to support our literature analysis process. We use the visualization capabilities of FCA to explore the literature, to discover and conceptually represent the main research topics in the FCA community. In this second part, we zoom in on and give an extensive overview of the papers published between 2003 and 2011 which applied FCA-based methods for knowledge discovery and ontology engineering in various application domains. These domains include software mining, web analytics, medicine, biology and chemistry data. © 2013 Elsevier Ltd. All rights reserved.
    1. Introduction
    … mining, i.e. gaining insight into source code with FCA.