Computational Cognition
Computational Cognition
Integrated DBS Software Design for Data-Driven Cognitive Processing

December 9, 2019; revised July 5, 2020

ROLAND HAUSSER
Friedrich Alexander Universität Erlangen Nürnberg (em.)
email: [email protected]

© Roland Hausser, August 8, 2020

Start May 07, 2017
CC1 July 29, 2017
CC2 March 05, 2018
CC3 June 27, 2018
CC4 August 03, 2018
CC5 September 20, 2018
CC6 October 01, 2018
CC7 November 01, 2018
CC8 December 03, 2018
CC9 January 02, 2019
CC10 February 16, 2019
CC11 March 11, 2019
CC12 April 5, 2019
CC13 May 2, 2019
CC14 June 14, 2019
CC15 July 2, 2019
CC16 December 9, 2019
CC17 July 5, 2020 (main changes in Chaps. 9, 12, 13)

Pen name given to the author by Professor Inseok Yang, President of the Korean Society of Linguistics, Seoul 1982.

Preface

It is shown by the history of science that a basic property inadvertently omitted in the beginning cannot be added post hoc. Therefore a theory of computational cognition should aim from the outset to be as complete as possible and draw from all three branches of modern science, i.e. the natural sciences, the engineering sciences, and the humanities (grammar, philosophy of language). The natural and the engineering sciences have long evolved a symbiotic relationship, but the humanities still stand apart. Designing and building a talking robot, however, is a challenge for which all three are needed. Agent-based Database Semantics (DBS) integrates recognition and action interfaces, an on-board orientation system, a content-addressable database, a data structure of nonrecursive feature structures with ordered attributes, and a linear algorithm.

I MODEL IN AGENT OR AGENT IN MODEL?

The empirical scope of a scientific theory is determined by its ontology. The DBS ontology for building a computational model of cognition is agent-based like the natural prototype: the artificial agent looks out into the world and interacts with it autonomously.
The world is given in the form of raw data and any interaction is solely by means of (i) the agent’s interfaces for recognition and action, and (ii) cognition-internal reasoning on content stored in memory. In this sense, the agent-based ontology places the model inside the agent.

The alternative ontology is sign-based: an introspector-definer analyzes the intuitive relation between language signs and an abstract universe of discourse, called model. If the model were to include an agent, it would be defined like tables or chairs, i.e. as a virtual, immaterial doll without any interfaces for autonomous recognition and action; what it perceives or does would be entirely a matter of definition in a metalanguage by the introspector-definer. In this sense, agents are inside the models of a sign-based ontology.

An agent-based ontology has a broader empirical base than a sign-based ontology: (i) artificial interfaces for the sensory modalities, (ii) content in memory resonating with current processing, (iii) switching between the speak and the hear mode, and (iv) switching between language and nonlanguage cognition are indispensable in a DBS robot, but abstracted away from in sign-based systems. Conversely, the agent-based ontology dispenses with the sign-based definition of artificial models in a metalanguage (if-then conditionals based on truth values) and is more realistic because it treats the world as given.

II DATA-DRIVEN OR SUBSTITUTION-DRIVEN?

The dichotomy between an agent-based and a sign-based ontology is complemented by the dichotomy between a data-driven and a substitution-driven computation. The input to a data-driven system is provided by recognition, memory, or a preceding operation; the output is content for action, provided by current processing or blueprints retrieved from memory.
A substitution-driven system, in contrast, creates a hierarchy with rules which replace an abstract node with a larger (top-down expansion) or smaller (bottom-up reduction) expression. Building on the work of Frege (1848–1925), substitution-driven formalisms were used by Hilbert (1862–1943), Russell (1872–1970), Tarski (1901–1983), and Gödel (1906–1978) for the development of axiomatic systems and resulted in recursion, automata, and computational complexity theory, such as the rewrite systems of Post (1936).

When Chomsky borrowed the substitution-driven approach for analyzing natural language in the form of “generative grammar,” he ran into the problem that the extremely successful original was not designed to provide distinctions between recognition and action, the speak and the hear mode, and language and nonlanguage content. Also, its ‘vertical’ derivation order is inherently in conflict with the ‘horizontal’ time-linear structure of natural language, artificially creating the “problem of serialization” for generative grammar. The way out was using the same start symbol S for randomly computing different substitutions to generate “base structures” alleged to be innate and universal.

DBS, in contrast, is data-driven. In the hear mode, the ordered input of concrete agent-external surfaces is lexically analyzed as proplets which are connected into content with the classical semantic relations of structure coded by address. Navigating along the semantic relations between the order-free proplets of a content activates the proplets traversed, making them input to the language-dependent surface realization of the speak mode.

Regarding computational complexity, data-driven LA-grammar provides the first, and so far the only, formal language hierarchy (TCS) which is orthogonal to the Chomsky hierarchy of substitution-driven Phrase Structure Grammar. It has been shown (FoCL Sects.
12.5, 22.3) that the natural languages are in the language class of C1-LAGs, which parse in linear time.

III MEDIA AND THEIR DUAL MODALITIES

The interaction between the agent and the raw data provided by its environment is based on the sensory media. Each medium has two modalities, one for recognition and one for action. For example, the sensory medium of speech has the dual sensory modalities of vocalization for action and audition for recognition, writing has the dual sensory modalities of manipulation for action and vision for recognition, and accordingly for Braille and signing (11.2.1).

In addition to the sensory media and modalities there are the processing media and modalities (11.2.2), which deal solely with content (no raw data). Examples are (i) natural cognition based on the electrochemical medium and (ii) artificial cognition based on the medium of a programming language. The dual processing modalities of a DBS robot are (a) the declarative (alphanumeric) commands in a programming language, written by software engineers to be interpreted for recognition and action by a computer, and (b) the procedural (electronic) modality for executing the declarative commands automatically.

IV FUNCTIONAL EQUIVALENCE AND UPSCALING CONTINUITY

The cognition of an artificial agent must be functionally equivalent to the natural prototype at a certain level of abstraction. For example, if the artificial agent is able to spontaneously call a color or a geometric shape the same as a human, functional equivalence is achieved. Similarly for the compositional aspect: if the artificial agent understands the difference between “the dog bit the man” and “the man bit the dog” the same as an English-speaking human, there is functional equivalence. In such instances proper functioning may be verified by tracing the software via the service channel of the artificial agent (1.1.1).
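
To make the word-order contrast concrete, the following Python sketch derives flat, order-free feature structures (proplets) for the two example sentences in a single left-to-right, data-driven pass. It is a minimal illustration under strong assumptions: the attribute names (`noun`, `verb`, `fnc`, `arg`, `prn`), the toy lexicon, and the functions `hear` and `verb_args` are hypothetical simplifications introduced here, not the DBS specification, and determiners are simply skipped.

```python
# Toy lexicon: surface -> (part of speech, core value).
# Hypothetical and deliberately tiny; for illustration only.
LEXICON = {
    "the": ("det", None),
    "dog": ("noun", "dog"),
    "man": ("noun", "man"),
    "bit": ("verb", "bite"),
}


def hear(sentence, prn):
    """Read surfaces left to right and return a list of flat proplets.

    Relations are coded by value rather than by nesting: the verb lists
    its arguments under 'arg', each noun names its functor under 'fnc'.
    'prn' is a proposition number shared by all proplets of one content.
    """
    proplets = []
    verb = None
    pending_nouns = []  # nouns read before the verb has arrived
    for surface in sentence.lower().split():
        pos, core = LEXICON[surface]
        if pos == "det":
            continue  # simplification: determiners are ignored
        if pos == "noun":
            noun = {"noun": core, "fnc": None, "prn": prn}
            proplets.append(noun)
            if verb is None:
                pending_nouns.append(noun)
            else:
                verb["arg"].append(core)
                noun["fnc"] = verb["verb"]
        elif pos == "verb":
            verb = {"verb": core, "arg": [], "prn": prn}
            proplets.append(verb)
            for noun in pending_nouns:
                verb["arg"].append(noun["noun"])
                noun["fnc"] = core
            pending_nouns = []
    return proplets


def verb_args(proplets):
    """Return the argument list of the (single) verb proplet."""
    return next(p["arg"] for p in proplets if "verb" in p)


content_1 = hear("the dog bit the man", prn=1)
content_2 = hear("the man bit the dog", prn=2)

print(verb_args(content_1))  # ['dog', 'man']
print(verb_args(content_2))  # ['man', 'dog']
```

In this sketch the two contents consist of the same kinds of proplets and differ only in the order of the values under the verb's `arg` attribute. Because the relations are stored as values and not as positions in a tree, the proplets of one content could be shuffled without losing the distinction between the two sentences.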
That computational (1) string search, (2) pattern matching, and (3) iteration are essential for building functional equivalence in a robot does not imply that natural cognition must use these methods as well. In analogy, even though a horse and a motorcycle are functionally equivalent at a certain level of abstraction, i.e. for one or two persons getting from A to B, the functioning of a motorcycle has no bearing on the biological analysis of a horse.

Functional equivalence is counterbalanced by a second, complementary standard of adequacy, namely the upscaling continuity of test cycles. A cycle consists of (a) testing the current software version automatically on systematic data (test suite), (b) correcting all errors, and (c) extending the test suite to additional data, after which the next test cycle is started. A cycle is successful if functional equivalence is achieved. Upscaling continuity holds if the cycle following a successful cycle is successful as well.

V ABSTRACTION IN PROGRAMMING: DECLARATIVE SPECIFICATION

Programming languages are in a continuous process of debugging and development (updates, new releases). This requires software maintenance which may be labor-intensive; for example, a program written in 1985 in Interlisp-D will require work by an expert in order to run in today’s Common Lisp. Also, computer programs, old or new, are difficult for humans to read and without additional explanation do not easily reveal the conceptual idea.

Therefore there have long been efforts at defining programs at a higher level of abstraction such as UML (unified modeling language) and ER (entity relationship model). The goal is to use notions which are meaningful to humans, but also translate into various programming languages. It turns out, however, that the general purpose aspiration of UML, ER, and similar proposals, i.e.
working for any task and any programming language, causes massive overhead and may require more work than they can actually save. This holds especially for research outside of today’s well-established computational applications based on substitution.

Therefore the data structure, the database schema, and the algorithm of DBS are defined directly but abstractly: they are custom-designed to handle the software tasks inherent in the cognition of a talking robot (15.2.2) in a simple, standardized, conceptually transparent, and empirically comprehensive manner, called declarative specification.
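
As an illustration of how a data structure and a database schema of this kind might be written down compactly, here is a minimal Python sketch of a content-addressable store in which flat proplets are filed under their core value and retrieved by that value. The class name `ContentStore`, its methods, the tuple `CORE_ATTRIBUTES`, and the attribute names are assumptions made for this sketch; they are not taken from the DBS specification.

```python
from collections import defaultdict

# Assumed names of the attributes that carry a proplet's core value.
CORE_ATTRIBUTES = ("noun", "verb", "adj")


class ContentStore:
    """Hypothetical content-addressable store: proplets are filed by core value."""

    def __init__(self):
        # core value -> proplets with that core value, in order of arrival
        self._lines = defaultdict(list)

    @staticmethod
    def core_value(proplet):
        """Return the value of the first core attribute found in the proplet."""
        for attr in CORE_ATTRIBUTES:
            if attr in proplet:
                return proplet[attr]
        raise ValueError("proplet has no core attribute")

    def store(self, proplet):
        """File the proplet under its core value."""
        self._lines[self.core_value(proplet)].append(proplet)

    def retrieve(self, core, prn=None):
        """Look up proplets by core value, optionally restricted to one proposition."""
        matches = self._lines.get(core, [])
        if prn is None:
            return list(matches)
        return [p for p in matches if p["prn"] == prn]


store = ContentStore()
for proplet in [
    {"noun": "dog", "fnc": "bite", "prn": 1},
    {"verb": "bite", "arg": ["dog", "man"], "prn": 1},
    {"noun": "man", "fnc": "bite", "prn": 1},
    {"noun": "man", "fnc": "bite", "prn": 2},
    {"verb": "bite", "arg": ["man", "dog"], "prn": 2},
    {"noun": "dog", "fnc": "bite", "prn": 2},
]:
    store.store(proplet)

# Navigate from a noun to its functor within proposition 2, using only
# the values stored in the proplets as retrieval keys.
dog = store.retrieve("dog", prn=2)[0]
verb = store.retrieve(dog["fnc"], prn=dog["prn"])[0]
print(verb["arg"])  # ['man', 'dog']
```

The design choice illustrated here is that retrieval needs no separate index or query language: the value stored under a relation attribute, together with the proposition number, already serves as the address of the related proplet.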