An Intelligent Hypertext System
Total Page:16
File Type:pdf, Size:1020Kb
An Intelligent Hypertext System Michael Junke Hu Thesis submitted for the degree of Doctor of Philosophy of the University of London. November 1994 Department of Computer Science University College London ProQuest Number: 10045477 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. uest. ProQuest 10045477 Published by ProQuest LLC(2016). Copyright of the Dissertation is held by the Author. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. Microform Edition © ProQuest LLC. ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 ABSTRACT This thesis investigates the apphcations of machine intelligence in information gen eration, organization, manipulation, search and retrieval. In order to alleviate and solve some problems in the present information retrieval (IR) community, and to increase the efficiency and effectiveness of information sys tems, a new data structure is proposed in the thesis. The conceptual index is external to the collection of information components (documents). It integrates the conven tional global index with a special semantic network. As a result, a much richer set of concepts, as well as the relationships between concepts, can be represented in the data structure. It is shown in the thesis that such a data structure is more suit able for sophisticated IR environments such as hypertext, and could make automatic information generation, self-adjustment and evolvement, inferencing and reasoning possible. Based on the conceptual index, a 3oft-link hypertext model is developed and investi gated. The soft-link hypertext model covers the Boolean search model, the probability model, the traditional hard-link hypertext model and the soft-link hypertext in one infrastructure. Its main features include automatic generation of the conceptual in dex, self-adjustment and evolvement, user-centered services for information retrieval and applications of machine intelligence in all aspects of the model. The soft-link hypertext model, including its state-space and all operations occurring in the space, is fuUy presented and evaluated in the thesis. It is implemented in a soft-link hypertext system called the Enhanced SuperBook and assessed in several small-scale controlled IR experiments. Its strengths, weakness, similarity with, and difference from other information models and systems are also studied in depth. Based on such an investigation, the thesis concludes that many aspects of informa tion retrieval should benefit from the extensive application of the machine intelligence. Automation in information generation, organization, self-adjustment and evolvement, and assistance for information retrieval can not only increase the efficiency and ef fectiveness of the information systems, but also represents the essence of, and should have great impact on, future generations of IR systems. ACKNOWLEDGEMENTS During the three-year Ph. D period I have constantly received support, both profes sionally and emotionally, from various people. Without them, the work presented in this dissertation would not have been possible. Firstly, I am very grateful to my supervisor. Professor Peter T. Kirstein, for the in valuable encouragement and guidance I received from him during the last 36 months. Since my first day in UCL, I have benefited enormously from his supervision of my research and the presentation of technical results. He has provided great stimulation and inspiration. I am indebted to Professor Kirstein for giving me so much of his time and knowledge, which has encouraged me greatly. I would also like to express my deepest appreciation to my supervisor, Mr. Nigel Chapman. I Hke and respect Nigel both as a supervisor and as a friend. It is he who has given me much important professional advice and pushed me to make progress towards my final goal. My implementation of the Enhanced SuperBook was based on software developed by Professor Michael Lesk and his team in Bellcore. I am indebted to him for his patient support for technical details. My sincere thanks go also to all my friends and colleagues in the Computer Science Department at UCL for their friendship and advice. In particular, I am very grateful to Mark Levene, GoH Montasser-Kohsari, Russel Winder, Paul Barker, Jon Crowcroft and Angela Sasse, to Jonathan Crompton and Steven Davey in the System Group, and to Annlette Riley, Kevin Randall, Dawn Bailey and Natalie May. My Ph. D research was sponsored by the British Council and the SBFSS. I also want to take this opportunity to express my sincere gratitude to the British Council, and to my programme officer, Ms. Grace Persuad, in the British Council. A lot of friends helped me with my experiments, both as domain experts and infor mation users. I want to thank them all for their time and effort. Finally, and most importantly, special thanks to my parents and my brother for their encouragement and support throughout the last three years, and to my close friends, Anthony Lavender and Mark Story, for their love, companionship and kind support. CONTENTS A B ST R A C T 2 ACKNOWLEDGEMENTS 3 1. Introduction 13 1.1 Background ................................................................................................... 13 1.2 My Research W ork ............................................................... 14 1.3 The State of Art .......................................................................................... 15 1.4 Notations Adopted ............................................... 15 1.5 Outline of the Dissertation .......................................................................... 16 2. Information and Information Retrieval 18 2.1 Information Space ......................................................................................... 18 2.2 Information Retrieval ................................................................................... 19 2.2.1 Traditional I R ................................................................................... 21 2.2.2 H y p e rtex t .......................................................................................... 25 3. Hypertext 27 3.1 Definition ...................................................................................................... 27 3.2 H istory ............................................................................................................. 27 3.2.1 The First Generation H ypertext ................................................... 28 3.2.2 The Second Generation of Hypertext .......................................... 29 3.2.3 The Third Generation of H ypertext ............................................. 30 3.3 Why Hypertext? .......................................................................................... 32 3.4 Models of H y p e rte x t .................................................................................... 34 3.4.1 Models of H y p e rte x t....................................................................... 34 3.4.2 Problems Existing in Present Hypertext Systems ....................... 35 3.5 Soft-Link Hypertext ................................................................................... 39 4. Conceptual Graph 44 4.1 Introduction .................................................................................................. 44 4.2 Some D efin itio n s ......................................................................................... 45 4.3 Conceptual Structure in Soft-link Hypertext ......................................... 47 5. The Conceptual Index and Its Formulation 50 5.1 Global Index and Indexing ......................................................................... 50 5.2 Local Hypertext S tru c tu re .................................................. 52 5.3 Relation between Global and Local Approaches .................................. 53 5.4 Conceptual Index — Global + L o c a l ...................................................... 55 5.5 Formulation of the Conceptual Index ..................................................... 58 5.6 Implementation & Experiments ............................................................... 60 5.6.1 Experiment 1 — Index Formulation ............................................. 61 5.7 S u m m a ry ..................................................................................................... 62 6. Spreading Activation and Learning 63 6.1 Connectionist Network ............................................................................ 63 6.1.1 Basic Principles ............................................................................ 63 6.1.2 H istory ............................................................................................ 64 6.1.3 Connectionist Network & Information Retrieval ................... 65 6.2 The Connectionist Structure in the Soft-Link Hypertext Model. 70 6.3 Automatic Index Formulation Mechanism ......................................... 77 6.3.1 Initialization — Before Indexing ................................................ 78 6.3.2 Indexing — Addition of Concepts and Conceptual Relations 78 6.3.3 Removal — Deletion of Concepts and Conceptual Relations 79 6.4 Activation Level and Activation Rules ...................................................