On the Effective Use of Cyc in a Question Answering System


On the Effective Use of Cyc in a Question Answering System

Jon Curtis, Gavin Matthews, David Baxter
Cycorp, Inc., 3721 Executive Center Drive, Suite 100, Austin, TX 78731
{jonc,gmatthew,baxter}@cyc.com

Topics: flexibility supported by discourse/user model, new types of questions, reasoning with incomplete knowledge, response generation.

Abstract

We describe a commercial question-answering system that uses AI – specifically, the Cyc system – to support passage retrieval and perform deductive QA, producing results superior to what each question-answering technique could produce alone.

1 Introduction

This paper describes a working prototype of a commercial question-answering system that uses artificial intelligence in conjunction with NLP-driven passage retrieval in a way that integrates the two markedly different approaches to question-answering and produces better results than either approach could yield alone.¹ MySentient² Answers 1.0 draws upon the Cyc system, a large knowledge-base, common-sense reasoning system. We describe three areas in which Cyc's knowledge contributes to the system's overall question-answering ability, both directly in deductive question-answering, and indirectly, by supporting passage retrieval. In particular, we focus on the use of Cyc for:

1. augmentation of NLP-based passage retrieval by generating NL expansions of key concepts mentioned in a question;
2. answering question types that pose a challenge to passage-retrieval methods, such as procedural ("How do I …?") and cost/benefit ("Why should I …?"); and
3. paraphrasing the results of deductive question answering as NL strings for display to an end user.

We close with a discussion of the current limitations of the integrated system and a description of anticipated extensions to the use of Cyc in future versions.

2 Cyc

Cyc is a state-of-the-art artificial intelligence program that has been in development since 1984. Drawing upon the world's largest general-purpose knowledge base of over 164,000 concepts and 3,300,000 facts (rules and ground assertions) relating them³, Cyc is the only AI program in existence today that can reasonably claim to have some degree of common sense. Cyc's knowledge is represented in CycL, a higher-order logical language based on predicate calculus. Every assertion in Cyc is represented in a context, or microtheory, which allows the representation of competing theories. Like ordinary concepts, microtheories are explicitly represented as first-class objects in Cyc, giving Cyc a reflective capability to reason about its own representations. Microtheories form a hierarchy that facilitates knowledge re-use (assertions stored in the most general contexts are always available) and inferential focus (given a query posed in a specific microtheory, other knowledge from sibling or more specific microtheories will not come into play). Cyc's inference engine combines general theorem proving (e.g., rule chaining) with specialized reasoning (e.g., subsumption and transitivity).
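To make this concrete, the following is a small illustrative sketch in CycL (ours, not drawn from the paper): a hypothetical domain microtheory, an assertion stated within it, and a rule of the kind the inference engine chains over. #$MotleyFoolUKMt and #$CreditCardFraud are hypothetical constants introduced here for illustration; #$Microtheory, #$genlMt, #$ist, #$genls, and #$BaseKB are standard Cyc vocabulary.

    ;; Hypothetical domain microtheory, inheriting from the most general context:
    (#$isa #$MotleyFoolUKMt #$Microtheory)
    (#$genlMt #$MotleyFoolUKMt #$BaseKB)

    ;; An assertion stated in that context (#$CreditCardFraud is hypothetical):
    (#$ist #$MotleyFoolUKMt
      (#$genls #$CreditCardFraud #$Action))

    ;; A rule of the kind the inference engine chains over:
    (#$implies
      (#$and (#$isa ?AGR #$LegalAgreement)
             (#$agreeingAgents ?AGR ?AGENT))
      (#$isa ?AGENT #$LegalAgent))

Because the assertion lives in #$MotleyFoolUKMt rather than a general context, a query posed in an unrelated sibling microtheory would not bring it into play, which is the inferential-focus behavior described above.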
Cyc has been used in commercial web-search systems (e.g., HotBot) and in question-answering systems, most recently in a purely deductive system for answering AP chemistry questions, developed in collaboration with Vulcan, Inc. [Friedland et al., 2004]. Cyc's rôle in the MySentient system heralds its first appearance in a commercial question-answering system. MySentient makes use of Cyc pervasively: as a means to augment NLP-based QA, as the basis for a deductive QA module, and in other capacities, such as clarification and profiling, that will be touched upon here.

¹ The work described in this paper was made possible by the financial backing of MySentient.
² "MySentient" is a registered trademark of MySentient Ireland (I.P.) Limited.
³ MySentient uses a carefully-chosen subset of the full Cyc knowledge base, with 137,000 concepts and 1,700,000 facts.
⁴ See http://www.mysentient.com/, http://www.cyc.com/, and http://www.cnlp.org/

3 MySentient Answers 1.0

MySentient Answers 1.0 is a working question-answering system, designed by MySentient Ireland (R&D) Ltd. of Dublin, Ireland, and implemented in collaboration with Cycorp, Inc. of Austin, Texas, and the Center for Natural Language Processing in Syracuse, New York⁴. MySentient Answers has been the subject of extensive demonstration to interested commercial parties and is expected to be available for public access in the near future.

3.1 Architecture

MySentient Answers is based on the S-Core architecture, which integrates disparate components into a uniform XML-based interface. Each component receives a storybook giving the history of the interaction, and its output is appended to the appropriate element. This design gives some of the flexibility of a blackboard architecture, yet allows some powerful simplifying assumptions.

Figure 1: MySentient Answers system architecture, showing selected components.

3.2 Question Answering Modules

The S-Core architecture allows any task to be attempted in parallel (or in sequence) by multiple competing modules. In MySentient Answers, there are several question-answering modules: various NLP passage-retrieval modules developed by CNLP, and a deductive question-answering (DQA) module based on Cyc. All QA modules return NL answers.

CNLP's question-answering capabilities are grounded in a quasi-logical representation – Language-to-Logic, or "L2L" – that has proven successful in recent TREC question-answering tracks [Diekema et al., 2000].

3.3 Methodology

The Cyc Knowledge Base is essentially open-domain, but deployments of MySentient Answers will be focused on specific domains that reflect the needs and interests of the customer. The final goal of the project is that the customer's domain experts will perform much of this specialization, using a suite of MySentient Authoring Tools. At the current stage of development, these tools are in a rudimentary state; therefore, much of the authoring done in support of the results described in this paper was simulated by blending prototype tools with the intervention of skilled ontologists. This simulation not only provides a proof-of-concept of the run-time question-answering system, but also permits an informed comparison between the representation capabilities required and the feasible capabilities of the planned authoring tools (see Section 8, "Current Limitations and Future Directions," below). The corpus used for this simulation was provided by the Motley Fool UK, based on its website.⁵

⁵ "The Motley Fool" is a registered trademark of Motley Fool, Inc. See http://www.fool.co.uk/ for the corpus website.

While coverage of a corpus must start with the corpus itself, it is also necessary to concentrate on test queries; Cyc's coverage of the Motley Fool domain therefore had two foci. The corpus was subjected to automated analysis for noun-phrase identification and interpretation, plus extraction of glossary entries; a manual pass of review and correction to ensure broad coverage of the domain followed. Known and blind question sets were prepared (by both Cycorp and MySentient) based on the corpus. The known question sets were analyzed by question type (see Question Types of Interest), and strategies for broad coverage of each question type were devised and implemented.

Independently, the quality-assurance team performed daily tests based on the question sets. Results were evaluated on a primitive basis by automatic comparison with a growing set of input/output pairs. Each input/output pair was classified as correct, incorrect, or correct but badly paraphrased. Incorrect results were reviewed with a focus on identifying and resolving the broad class of defect (such as missing or erroneous knowledge) rather than fixing problems specific to particular questions.

The intensive ontological engineering effort for the Motley Fool UK domain was performed over a four-week period and took 691.25 person-hours. The source corpus was equivalent to about 200 pages of text, and a total of 286 test questions were prepared for that domain. The NLP-based components also underwent a training process against the Motley Fool corpus; however, this training was done independently of the simulated authoring/ontological-engineering effort done for Cyc. As a limited test of how MySentient Answers benefits from integrating both NLP and deductive approaches to question-answering, MySentient prepared 132 questions that were posed to the system; for each question, each QA system was scored on whether it produced a satisfactory answer. In borderline cases, a half-point was awarded.

Overall, the multiple CNLP QA modules scored 63% and the Cyc DQA module scored 34%. This asymmetry is to be expected, given the relative maturity of NLP systems in the QA domain. The federation of QA modules (taking the high score for each question) scored 79%, a significant improvement over the individual QA modules.
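Written out, the federation score is simply the per-question maximum over modules, averaged over the question set (our formalization of the scoring just described, with $s_m(q) \in \{0, \tfrac{1}{2}, 1\}$ the score module $m$ earns on question $q$):

$$S_{\mathrm{fed}} \;=\; \frac{1}{|Q|} \sum_{q \in Q} \max_{m \in M} s_m(q).$$

Because the maximum is at least as large as any single module's score, $S_{\mathrm{fed}}$ can never fall below the best individual module; the gap between 79% and 63% thus directly measures how often the modules' strengths complement one another.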
In several cases, both modules gave usefully different answers that, taken together, form a rounded answer to the user's question. Two interesting examples are:

In response to "How do I protect myself from credit card fraud?" …

…the user-specific lexicon is linked to the general lexicon for that language. So for British English, Cyc will generate a sub-context link from the user-specific lexicon to #$BritishEnglishMt, making its data about British spellings, common words, etc., available. For more information on the representation and use of lexicons in Cyc, see [Burns and Davis, 1999].
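As a sketch, that sub-context link is just a #$genlMt assertion from the user's lexicon to the general one (the user-lexicon constant here is hypothetical; #$genlMt and #$BritishEnglishMt are as described above):

    ;; Hypothetical user-specific lexicon inheriting British English
    ;; spellings and vocabulary via the microtheory hierarchy:
    (#$isa #$UserSmithLexicalMt #$Microtheory)
    (#$genlMt #$UserSmithLexicalMt #$BritishEnglishMt)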