Using Linked Data for Semi-Automatic Guesstimation

Total Page:16

File Type:pdf, Size:1020Kb

Using Linked Data for Semi-Automatic Guesstimation Using Linked Data for Semi-Automatic Guesstimation Jonathan A. Abourbih and Alan Bundy and Fiona McNeill∗ [email protected], [email protected], [email protected] University of Edinburgh, School of Informatics 10 Crichton Street, Edinburgh, EH8 9AB, United Kingdom Abstract and Semantic Web systems. Next, we outline the process of GORT is a system that combines Linked Data from across guesstimation. Then, we describe the organisation and im- several Semantic Web data sources to solve guesstimation plementation of GORT. Finally, we close with an evaluation problems, with user assistance. The system uses customised of the system’s performance and adaptability, and compare inference rules over the relationships in the OpenCyc ontol- it to several other related systems. We also conclude with a ogy, combined with data from DBPedia, to reason and per- brief section on future work. form its calculations. The system is extensible with new Linked Data, as it becomes available, and is capable of an- Literature Survey swering a small range of guesstimation questions. Combining facts to answer a user query is a mature field. The DEDUCOM system (Slagle 1965) was one of the ear- Introduction liest systems to perform deductive query answering. DE- The true power of the Semantic Web will come from com- DUCOM applies procedural knowledge to a set of facts in bining information from heterogeneous data sources to form a knowledge base to answer user queries, and a user can new knowledge. A system that is capable of deducing an an- also supplement the knowledge base with further facts. DE- swer to a query, even when necessary knowledge might be DUCOM also uses theorem proving techniques to show that missing or incomplete, could have broad applications and the calculated answer follows logically from the axioms appeal. In this paper, we describe a system that uses Linked in the knowledge base. The QUARK/SNARK/Geo-Logica Data to perform such compound query answering, as applied systems (Waldinger et al. 2003) exploit large, external data to the domain of guesstimation—the process of finding ap- sources to do deductive query answering, however their fo- proximate (within an order of magnitude) numeric answers cus was on yes/no and geographical queries and not numeric to queries, potentially combining unrelated knowledge in in- calculations. Cyc (Lenat 1995) is a large, curated knowledge teresting ways (Weinstein and Adam 2008). The hypothesis base that can also apply deductive reasoning to derive new that we are attempting to test is that data contained in het- numeric facts in response to a user query. Wolfram|Alpha erogeneous Linked Data sources can be reasoned over and (http://www.wolframalpha.com) is a new system combined, using a small set of rules, to calculate new infor- that combines facts in a curated knowledge base to arrive mation, with occasional human intervention. at new knowledge, and can produce a surprisingly large In this paper, we describe our implementation of a sys- amount of information from a very simple query. tem, called GORT (Guesstimation with Ontologies and Rea- Query answering over the Semantic Web is a new soning Techniques), that does compound reasoning to an- field. The most developed system in this field is Power- swer a small range of guesstimation questions. The sys- Aqua (Lopez, Motta, and Uren 2006), which uses a Se- tem’s inference rules are implemented in Prolog using its Se- mantic Web search engine to do fact retrieval, but it does mantic Web module (Wielemaker, Schreiber, and Wielinga not do any deduction based on that knowledge. There has 2003) for RDF inference. The system uses the OpenCyc been work on using Prolog’s Semantic Web module in con- OWL ontology as a basis for its reasoning, and uses facts junction with custom-built ontologies to construct interac- from other Linked Data sources in its calculations. This tive query systems (Wielemaker 2009). Our GORT system, work is the result of a three-month-long Master’s disserta- described in this paper, builds on prior work in knowledge- tion project (Abourbih 2009), supervised by Bundy and Mc- based systems and work on Semantic Web reasoning with Neill. Only the advent of interlinked data sources has made Prolog. The system implements inference rules and plans this system possible. for combining facts in Semantic Web data sources to derive The rest of this paper is organised as follows: First, we de- new information in response to a user query. scribe prior work in the fields of deductive query answering ∗Funded by ONR project N000140910467 Guesstimation Copyright c 2010, Association for the Advancement of Artificial Guesstimation is a method of calculating an approximate Intelligence (www.aaai.org). All rights reserved. answer to a query when some needed fact is partially or 2 completely missing. The types of problems that lend them- Humans, the population of the Earth, is about 6 × 109. selves well to guesstimation are those for which a precise an- Substituting into (3) gives: swer cannot be known or easily found. Weinstein and Adam (Humans) × (2008) present several dozen problems that lend themselves Area Humans to the method, and illustrate a range of guesstimation tac- ≈> 1m2 × (6 × 109) tics. Two worked examples, of the type that GORT has been ≈> 6 × 109 m2 designed to solve, are given below. Example 1. How many golf balls would it take to circle the Guesstimation Plans Earth at the equator? The examples from the previous section follow patterns that are common to a range of guesstimation queries. In this sec- Solution. It is clear that this problem can be solved by cal- tion, we generalise the examples above into plans that can culating: be instantiated for other similar problems. Circumference(Earth) (1) Diameter(GolfBall) Count Plan The solution requires the circumference of the Earth and the Example 1 asks for a count of how many golf balls fit inside diameter of a golf ball. Weinstein and Adam (2008) show the circumference of the Earth. This can be generalised to how the Earth’s circumference can be deduced from com- the form, “How many of some Smaller entity will fit inside mon knowledge: a 5 hour flight from Los Angeles to New of some Bigger entity?” and expressed as: York crosses 3 time zones at a speed of about 1000 km/h, F (Bigger) Count(Smaller, Bigger) ≈> , (4) and there are 24 time zones around the Earth, yielding: G(Smaller) 5h × 24 time zones × 103km/h where Count(Smaller, Bigger) represents the count of 3 time zones how many Smaller objects fit inside some Bigger object. ≈ 4 × 104 km ≈ 4 × 107 m. Functions F (x) and G(x) represent measurements on an ob- ject, like volume, mass, length, or duration. The units of Note that in (1), Earth is an individual object, but GolfBall measurement must be the same. The variables Smaller and does not denote any particular golf ball, but instead an arbi- Bigger may be instantiated with a class, or an instance. In trary, or ‘prototypical’ golf ball. To distinguish between the Example 1, this guesstimation plan was instantiated with the class of objects called GolfBall and the arbitrary golf ball, substitutions: we refer to the latter as GolfBall. Continuing, we make an educated guess for Diameter(GolfBall) of about 4 cm. Smaller/GolfBall , and Substituting into (1) gives: Bigger/Earth . ( ) 4 × 107m Circumference Earth ≈> ≈> 109 Size Plan −2 Diameter(GolfBall) 4 × 10 m Example 2 asks for the total area that would be occupied S by the set of all humans. This, too, can be generalised to: In the solution to Example 1, the notation represents F (e) e S “What is the total of some property for all instances an arbitrary, ‘prototypical’ element from the set . This no- S of a class ?” and expressed as: tation and interpretation of our -operator is based on that of the Hilbert- operator (Avigad and Zach 2007). F (e) ≈>F(S) ×S, (5) Example 2. What would be the total area occupied if all e∈S humans in the world were crammed into one place? where S represents the count of elements in set S, and S is an arbitrary, ‘typical’ element of S. In Example 2, Solution. We begin by summing up the area required by this guesstimation plan was instantiated with the substitution each human on Earth (2). If we assume that every human S/Humans. requires the same area, to within an order of magnitude, then this rewrites to: Implementation Area(e) (2) A system to perform compound query answering has a small e∈Humans set of requirements. First, the system needs some general ≈> Area(Humans) ×Humans, (3) background knowledge of the nature of the world. This knowledge is encoded in a basic ontology that forms the where Human is an arbitrary, representative element of heart of the system’s reasoning. Second, the system needs the set of all Humans, and ≈> represents a rewrite that access to some set of numeric ground facts with which to is accurate to within an order of magnitude. The value compute answers. These facts may be available directly in for Area(Human) is a somewhat arbitrary choice (how the basic ontology, or from other linked data sources. Third, close is too close?), so we make an educated guess of about the system needs to have some procedural knowledge, and 1m2. It is also generally-accepted common knowledge that needs to know how to apply the procedures to advance a 3 tion of Canada from the CIA Factbook ontology if it is pro- User Interface vided with the URIs for Canada and population. The sys- tem is also capable of retrieving a fact for a given Subject Inference System Knowledge Base and Predicate by following the graph of owl:sameAs rela- Domain Domain Basic tionships on Subject and rdfs:subPropertyOf relationships Independent Specific Ontology on Predicate until an appropriate Object is found.
Recommended publications
  • A Survey of Top-Level Ontologies to Inform the Ontological Choices for a Foundation Data Model
    A survey of Top-Level Ontologies To inform the ontological choices for a Foundation Data Model Version 1 Contents 1 Introduction and Purpose 3 F.13 FrameNet 92 2 Approach and contents 4 F.14 GFO – General Formal Ontology 94 2.1 Collect candidate top-level ontologies 4 F.15 gist 95 2.2 Develop assessment framework 4 F.16 HQDM – High Quality Data Models 97 2.3 Assessment of candidate top-level ontologies F.17 IDEAS – International Defence Enterprise against the framework 5 Architecture Specification 99 2.4 Terminological note 5 F.18 IEC 62541 100 3 Assessment framework – development basis 6 F.19 IEC 63088 100 3.1 General ontological requirements 6 F.20 ISO 12006-3 101 3.2 Overarching ontological architecture F.21 ISO 15926-2 102 framework 8 F.22 KKO: KBpedia Knowledge Ontology 103 4 Ontological commitment overview 11 F.23 KR Ontology – Knowledge Representation 4.1 General choices 11 Ontology 105 4.2 Formal structure – horizontal and vertical 14 F.24 MarineTLO: A Top-Level 4.3 Universal commitments 33 Ontology for the Marine Domain 106 5 Assessment Framework Results 37 F. 25 MIMOSA CCOM – (Common Conceptual 5.1 General choices 37 Object Model) 108 5.2 Formal structure: vertical aspects 38 F.26 OWL – Web Ontology Language 110 5.3 Formal structure: horizontal aspects 42 F.27 ProtOn – PROTo ONtology 111 5.4 Universal commitments 44 F.28 Schema.org 112 6 Summary 46 F.29 SENSUS 113 Appendix A F.30 SKOS 113 Pathway requirements for a Foundation Data F.31 SUMO 115 Model 48 F.32 TMRM/TMDM – Topic Map Reference/Data Appendix B Models 116 ISO IEC 21838-1:2019
    [Show full text]
  • Artificial Intelligence
    BROAD AI now and later Michael Witbrock, PhD University of Auckland Broad AI Lab @witbrock Aristotle (384–322 BCE) Organon ROOTS OF AI ROOTS OF AI Santiago Ramón y Cajal (1852 -1934) Cerebral Cortex WHAT’S AI • OLD definition: AI is everything we don’t yet know how program • Now some things that people can’t do: • unique capabilities (e.g. Style transfer) • superhuman performance (some areas of speech, vision, games, some QA, etc) • Current AI Systems can be divided by their kind of capability: • Skilled (Image recognition, Game Playing (Chess, Atari, Go, DoTA), Driving) • Attentive (Trading: Aidyia; Senior Care: CareMedia, Driving) • Knowledgeable, (Google Now, Siri, Watson, Cortana) • High IQ (Cyc, Soar, Wolfram Alpha) GOFAI • Thought is symbol manipulation • Large numbers of precisely defined symbols (terms) • Based on mathematical logic (implies (and (isa ?INST1 LegalAgreement) (agreeingAgents ?INST1 ?INST2)) (isa ?INST2 LegalAgent)) • Problems solved by searching for transformations of symbolic representations that lead to a solution Slow Development Thinking Quickly Thinking Slowly (System I) (System II) Human Superpower c.f. other Done well by animals and people animals Massively parallel algorithms Serial and slow Done poorly until now by computers Done poorly by most people Not impressive to ordinary people Impressive (prizes, high pay) "Sir, an animal’s reasoning is like a dog's walking on his hind legs. It is not done well; but you are surprised to find it done at all.“ - apologies to Samuel Johnson Achieved on computers by high- Fundamental design principle of power, low density, slow computers simulation of vastly different Computer superpower c.f. neural hardware human Recurrent Deep Learning & Deep Reasoning MACHINE LEARNING • Meaning is implicit in the data • Thought is the transformation of learned representations http://karpathy.github.io/2015/05/21/rnn- effectiveness/ .
    [Show full text]
  • Knowledge Graphs on the Web – an Overview Arxiv:2003.00719V3 [Cs
    January 2020 Knowledge Graphs on the Web – an Overview Nicolas HEIST, Sven HERTLING, Daniel RINGLER, and Heiko PAULHEIM Data and Web Science Group, University of Mannheim, Germany Abstract. Knowledge Graphs are an emerging form of knowledge representation. While Google coined the term Knowledge Graph first and promoted it as a means to improve their search results, they are used in many applications today. In a knowl- edge graph, entities in the real world and/or a business domain (e.g., people, places, or events) are represented as nodes, which are connected by edges representing the relations between those entities. While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chap- ter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap. Keywords. Knowledge Graph, Linked Data, Semantic Web, Profiling 1. Introduction Knowledge Graphs are increasingly used as means to represent knowledge. Due to their versatile means of representation, they can be used to integrate different heterogeneous data sources, both within as well as across organizations. [8,9] Besides such domain-specific knowledge graphs which are typically developed for specific domains and/or use cases, there are also public, cross-domain knowledge graphs encoding common knowledge, such as DBpedia, Wikidata, or YAGO. [33] Such knowl- edge graphs may be used, e.g., for automatically enriching data with background knowl- arXiv:2003.00719v3 [cs.AI] 12 Mar 2020 edge to be used in knowledge-intensive downstream applications.
    [Show full text]
  • Why Has AI Failed? and How Can It Succeed?
    Why Has AI Failed? And How Can It Succeed? John F. Sowa VivoMind Research, LLC 10 May 2015 Extended version of slides for MICAI'14 ProblemsProblems andand ChallengesChallenges Early hopes for artificial intelligence have not been realized. Language understanding is more difficult than anyone thought. A three-year-old child is better able to learn, understand, and generate language than any current computer system. Tasks that are easy for many animals are impossible for the latest and greatest robots. Questions: ● Have we been using the right theories, tools, and techniques? ● Why haven’t these tools worked as well as we had hoped? ● What other methods might be more promising? ● What can research in neuroscience and psycholinguistics tell us? ● Can it suggest better ways of designing intelligent systems? 2 Early Days of Artificial Intelligence 1960: Hao Wang’s theorem prover took 7 minutes to prove all 378 FOL theorems of Principia Mathematica on an IBM 704 – much faster than two brilliant logicians, Whitehead and Russell. 1960: Emile Delavenay, in a book on machine translation: “While a great deal remains to be done, it can be stated without hesitation that the essential has already been accomplished.” 1965: Irving John Good, in speculations on the future of AI: “It is more probable than not that, within the twentieth century, an ultraintelligent machine will be built and that it will be the last invention that man need make.” 1968: Marvin Minsky, technical adviser for the movie 2001: “The HAL 9000 is a conservative estimate of the level of artificial intelligence in 2001.” 3 The Ultimate Understanding Engine Sentences uttered by a child named Laura before the age of 3.
    [Show full text]
  • Logic-Based Technologies for Intelligent Systems: State of the Art and Perspectives
    information Article Logic-Based Technologies for Intelligent Systems: State of the Art and Perspectives Roberta Calegari 1,* , Giovanni Ciatto 2 , Enrico Denti 3 and Andrea Omicini 2 1 Alma AI—Alma Mater Research Institute for Human-Centered Artificial Intelligence, Alma Mater Studiorum–Università di Bologna, 40121 Bologna, Italy 2 Dipartimento di Informatica–Scienza e Ingegneria (DISI), Alma Mater Studiorum–Università di Bologna, 47522 Cesena, Italy; [email protected] (G.C.); [email protected] (A.O.) 3 Dipartimento di Informatica–Scienza e Ingegneria (DISI), Alma Mater Studiorum–Università di Bologna, 40136 Bologna, Italy; [email protected] * Correspondence: [email protected] Received: 25 February 2020; Accepted: 18 March 2020; Published: 22 March 2020 Abstract: Together with the disruptive development of modern sub-symbolic approaches to artificial intelligence (AI), symbolic approaches to classical AI are re-gaining momentum, as more and more researchers exploit their potential to make AI more comprehensible, explainable, and therefore trustworthy. Since logic-based approaches lay at the core of symbolic AI, summarizing their state of the art is of paramount importance now more than ever, in order to identify trends, benefits, key features, gaps, and limitations of the techniques proposed so far, as well as to identify promising research perspectives. Along this line, this paper provides an overview of logic-based approaches and technologies by sketching their evolution and pointing out their main application areas. Future perspectives for exploitation of logic-based technologies are discussed as well, in order to identify those research fields that deserve more attention, considering the areas that already exploit logic-based approaches as well as those that are more likely to adopt logic-based approaches in the future.
    [Show full text]
  • Knowledge on the Web: Towards Robust and Scalable Harvesting of Entity-Relationship Facts
    Knowledge on the Web: Towards Robust and Scalable Harvesting of Entity-Relationship Facts Gerhard Weikum Max Planck Institute for Informatics http://www.mpi-inf.mpg.de/~weikum/ Acknowledgements 2/38 Vision: Turn Web into Knowledge Base comprehensive DB knowledge fact of human knowledge assets extraction • everything that (Semantic (Statistical Web) Web) Wikipedia knows • machine-readable communities • capturing entities, (Social Web) classes, relationships Source: DB & IR methods for knowledge discovery. Communications of the ACM 52(4), 2009 3/38 Knowledge as Enabling Technology • entity recognition & disambiguation • understanding natural language & speech • knowledge services & reasoning for semantic apps • semantic search: precise answers to advanced queries (by scientists, students, journalists, analysts, etc.) German chancellor when Angela Merkel was born? Japanese computer science institutes? Politicians who are also scientists? Enzymes that inhibit HIV? Influenza drugs for pregnant women? ... 4/38 Knowledge Search on the Web (1) Query: sushi ingredients? Results: Nori seaweed Ginger Tuna Sashimi ... Unagi http://www.google.com/squared/5/38 Knowledge Search on the Web (1) Query:Query: JapaneseJapanese computerscomputeroOputer science science ? institutes ? http://www.google.com/squared/6/38 Knowledge Search on the Web (2) Query: politicians who are also scientists ? ?x isa politician . ?x isa scientist Results: Benjamin Franklin Zbigniew Brzezinski Alan Greenspan Angela Merkel … http://www.mpi-inf.mpg.de/yago-naga/7/38 Knowledge Search on the Web (2) Query: politicians who are married to scientists ? ?x isa politician . ?x isMarriedTo ?y . ?y isa scientist Results (3): [ Adrienne Clarkson, Stephen Clarkson ], [ Raúl Castro, Vilma Espín ], [ Jeannemarie Devolites Davis, Thomas M. Davis ] http://www.mpi-inf.mpg.de/yago-naga/8/38 Knowledge Search on the Web (3) http://www-tsujii.is.s.u-tokyo.ac.jp/medie/ 9/38 Take-Home Message If music was invented Information is not Knowledge.
    [Show full text]
  • Talking to Computers in Natural Language
    feature Talking to Computers in Natural Language Natural language understanding is as old as computing itself, but recent advances in machine learning and the rising demand of natural-language interfaces make it a promising time to once again tackle the long-standing challenge. By Percy Liang DOI: 10.1145/2659831 s you read this sentence, the words on the page are somehow absorbed into your brain and transformed into concepts, which then enter into a rich network of previously-acquired concepts. This process of language understanding has so far been the sole privilege of humans. But the universality of computation, Aas formalized by Alan Turing in the early 1930s—which states that any computation could be done on a Turing machine—offers a tantalizing possibility that a computer could understand language as well. Later, Turing went on in his seminal 1950 article, “Computing Machinery and Intelligence,” to propose the now-famous Turing test—a bold and speculative method to evaluate cial intelligence at the time. Daniel Bo- Figure 1). SHRDLU could both answer whether a computer actually under- brow built a system for his Ph.D. thesis questions and execute actions, for ex- stands language (or more broadly, is at MIT to solve algebra word problems ample: “Find a block that is taller than “intelligent”). While this test has led to found in high-school algebra books, the one you are holding and put it into the development of amusing chatbots for example: “If the number of custom- the box.” In this case, SHRDLU would that attempt to fool human judges by ers Tom gets is twice the square of 20% first have to understand that the blue engaging in light-hearted banter, the of the number of advertisements he runs, block is the referent and then perform grand challenge of developing serious and the number of advertisements is 45, the action by moving the small green programs that can truly understand then what is the numbers of customers block out of the way and then lifting the language in useful ways remains wide Tom gets?” [1].
    [Show full text]
  • Wikitology Wikipedia As an Ontology
    Creating and Exploiting a Web of Semantic Data Tim Finin University of Maryland, Baltimore County joint work with Zareen Syed (UMBC) and colleagues at the Johns Hopkins University Human Language Technology Center of Excellence ICAART 2010, 24 January 2010 http://ebiquity.umbc.edu/resource/html/id/288/ Overview • Introduction (and conclusion) • A Web of linked data • Wikitology • Applications • Conclusion introduction linked data wikitology applications conclusion Conclusions • The Web has made people smarter and more capable, providing easy access to the world's knowledge and services • Software agents need better access to a Web of data and knowledge to enhance their intelligence • Some key technologies are ready to exploit: Semantic Web, linked data, RDF search engines, DBpedia, Wikitology, information extraction, etc. introduction linked data wikitology applications conclusion The Age of Big Data • Massive amounts of data is available today on the Web, both for people and agents • This is what‟s driving Google, Bing, Yahoo • Human language advances also driven by avail- ability of unstructured data, text and speech • Large amounts of structured & semi-structured data is also coming online, including RDF • We can exploit this data to enhance our intelligent agents and services introduction linked data wikitology applications conclusion Twenty years ago… Tim Berners-Lee‟s 1989 WWW proposal described a web of relationships among named objects unifying many info. management tasks. Capsule history • Guha‟s MCF (~94) • XML+MCF=>RDF
    [Show full text]
  • AI/ML Finding a Doing Machine
    Digital Readiness: AI/ML Finding a doing machine Gregg Vesonder Stevens Institute of Technology http://vesonder.com Three Talks • Digital Readiness: AI/ML, The thinking system quest. – Artificial Intelligence and Machine Learning (AI/ML) have had a fascinating evolution from 1950 to the present. This talk sketches the main themes of AI and machine learning, tracing the evolution of the field since its beginning in the 1950s and explaining some of its main concepts. These eras are characterized as “from knowledge is power” to “data is king”. • Digital Readiness: AI/ML, Finding a doing machine. – In the last decade Machine Learning had a remarkable success record. We will review reasons for that success, review the technology, examine areas of need and explore what happened to the rest of AI, GOFAI (Good Old Fashion AI). • Digital Readiness: AI/ML, Common Sense prevails? – Will there be another AI Winter? We will explore some clues to where the current AI/ML may reunite with GOFAI (Good Old Fashioned AI) and hopefully expand the utility of both. This will include extrapolating on the necessary melding of AI with engineering, particularly systems engineering. Roadmap • Systems – Watson – CYC – NELL – Alexa, Siri, Google Home • Technologies – Semantic web – GPUs and CUDA – Back office (Hadoop) – ML Bias • Last week’s questions Winter is Coming? • First Summer: Irrational Exuberance (1948 – 1966) • First Winter (1967 – 1977) • Second Summer: Knowledge is Power (1978 – 1987) • Second Winter (1988 – 2011) • Third Summer (2012 – ?) • Why there might not be a third winter! Henry Kautz – Engelmore Lecture SYSTEMS Winter 2 Systems • Knowledge is power theme • Influence of the web, try to represent all knowledge – Creating a general ontology organizing everything in the world into a hierarchy of categories – Successful deep ontologies: Gene Ontology and CML Chemical Markup Language • Indeed extreme knowledge – CYC and Open CYC – IBM’s Watson Upper ontology of the world Russell and Norvig figure 12.1 Properties of a subject area and how they are related Ferrucci, D., et.al.
    [Show full text]
  • Integrating Problem-Solving Methods Into Cyc
    Integrating Problem-Solving Methods into Cyc James Stuart Aitken Dimitrios Sklavakis Artificial Intelligence Applications Institute School of Artificial Intelligence Division of Informatics Division of Informatics University of Edinburgh University of Edinburgh Edinburgh EH1 1HN, Scotland Edinburgh EH1 1HN, Scotland Email: stuartQaiai.ed.ac.uk Email: dimitrisQdai .ed.ac.uk Abstract is unique in that it, has potential solutions to each of these problems. This paper argues that the reuse of domain knowledge must be complemented by the reuse Cyc uses a resolution-based inference procedure that of problem-solving methods. Problem-solving has a number of optimisations that improve the scalabil• methods (PSMs) provide a means to struc• ity of the architectures For example, a specialised taxo• ture search, and can provide tractable solu• nornic reasoning module replaces the application of the tions to reasoning with a very large knowl- logical rule for transitivity of class membership. Where edge base. We show that PSMs can be used specialised modules are not implemented, Cyc makes use in a way which complements large-scale rep• of weak search methods to perform inference. Cyc lacks resentation techniques, and optimisations such any principles for structuring inference at a conceptual as those for taxonornie reasoning found in level. Problem-solving methods provide precisely this Cyc. Our approach illustrates the advantages structure, hence the importance of integrating structur• of task-oriented knowledge modelling and we ing principles into a scalable KBS architecture. demonstrate that the resulting ontologies have both task-dependent and task-independent el• Robustness and reusability are related properties of ements. Further, we show how the task ontol• the knowledge representation scheme and the inference ogy can be organised into conceptual levels to rules: Predicates such as bordersOn and between, defined reflect knowledge typing principles.
    [Show full text]
  • The Applicability of Natural Language Processing (NLP) To
    THE AMERICAN ARCHIVIST The Applicability of Natural Language Processing (NLP) to Archival Properties and Downloaded from http://meridian.allenpress.com/american-archivist/article-pdf/61/2/400/2749134/aarc_61_2_j3p8200745pj34v6.pdf by guest on 01 October 2021 Objectives Jane Greenberg Abstract Natural language processing (NLP) is an extremely powerful operation—one that takes advantage of electronic text and the computer's computational capabilities, which surpass human speed and consistency. How does NLP affect archival operations in the electronic environment? This article introduces archivists to NLP with a presentation of the NLP Continuum and a description of the Archives Axiom, which is supported by an analysis of archival properties and objectives. An overview of the basic information retrieval (IR) framework is provided and NLP's application to the electronic archival environment is discussed. The analysis concludes that while NLP offers advantages for indexing and ac- cessing electronic archives, its incapacity to understand records and recordkeeping systems results in serious limitations for archival operations. Introduction ince the advent of computers, society has been progressing—at first cautiously, but now exponentially—from a dependency on printed text, to functioning almost exclusively with electronic text and other elec- S 1 tronic media in an array of environments. Underlying this transition is the growing accumulation of massive electronic archives, which are prohibitively expensive to index by traditional manual procedures. Natural language pro- cessing (NLP) offers an option for indexing and accessing these large quan- 1 The electronic society's evolution includes optical character recognition (OCR) and other processes for transforming printed text to electronic form. Also included are processes, such as imaging and sound recording, for transforming and recording nonprint electronic manifestations, but this article focuses on electronic text.
    [Show full text]
  • Toward a Logic of Everyday Reasoning
    Toward a Logic of Everyday Reasoning Pei Wang Abstract: Logic should return its focus to valid reasoning in real-world situations. Since classical logic only covers valid reasoning in a highly idealized situation, there is a demand for a new logic for everyday reasoning that is based on more realistic assumptions, while still keeps the general, formal, and normative nature of logic. NAL (Non-Axiomatic Logic) is built for this purpose, which is based on the assumption that the reasoner has insufficient knowledge and resources with respect to the reasoning tasks to be carried out. In this situation, the notion of validity has to be re-established, and the grammar rules and inference rules of the logic need to be designed accordingly. Consequently, NAL has features very different from classical logic and other non-classical logics, and it provides a coherent solution to many problems in logic, artificial intelligence, and cognitive science. Keyword: non-classical logic, uncertainty, openness, relevance, validity 1 Logic and Everyday Reasoning 1.1 The historical changes of logic In a broad sense, the study of logic is concerned with the principles and forms of valid reasoning, inference, and argument in various situations. The first dominating paradigm in logic is Aristotle’s Syllogistic [Aristotle, 1882], now usually referred to as traditional logic. This study was carried by philosophers and logicians including Descartes, Locke, Leibniz, Kant, Boole, Peirce, Mill, and many others [Bochenski,´ 1970, Haack, 1978, Kneale and Kneale, 1962]. In this tra- dition, the focus of the study is to identify and to specify the forms of valid reasoning Pei Wang Department of Computer and Information Sciences, Temple University, Philadelphia, USA e-mail: [email protected] 1 2 Pei Wang in general, that is, the rules of logic should be applicable to all domains and situa- tions where reasoning happens, as “laws of thought”.
    [Show full text]