
DOCUMENT RESUME

ED 244 607 IR 011 129. AUTHOR: Gevarter, William B. TITLE: An Overview of Computer-Based Natural Language Processing. INSTITUTION: National Aeronautics and Space Administration, Washington, D.C. REPORT NO: NASA-TM-85635. PUB DATE: 83. NOTE: 52p.; Best copy available. PUB TYPE: Information Analyses (070). EDRS PRICE: MF01/PC03 Plus Postage. DESCRIPTORS: *; *Computational Linguistics; *; Computer Software; Grammar; Information Processing; *Language Processing; *Man Machine Systems; Program Descriptions; Research Projects. IDENTIFIERS: Management Systems; *Natural Language Processing. ABSTRACT: Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well.
This report presents an overview of: (1) NLP applications; (2) the three basic NLP approaches; (3) various types of grammars; (4) difficulties in semantic processing of phrases and sentences; (5) knowledge representation (KR); (6) types of syntactic parsing; (7) the relationship among semantics, parsing, and understanding; (8) NLP system types, including natural language interface (NLI), computer-aided instruction (CAI), discourse, text understanding, and text generation systems; (9) the functions, approaches, capabilities, and limitations of NLP systems developed for research purposes; (10) current research NLP systems; (11) the approximate price, developer, purpose, and features of some commercial NLP systems; (12) the state of the art of NLP systems; (13) problems and issues related to language use, linguistics, conversation, processor design, database interfaces, and text understanding; (14) research required in these areas; (15) the principal United States and foreign participants in NLP, and the principal U.S. government agencies funding NLP research; (16) future trends and expectations; and (17) further sources of information. A 39-item bibliography and a glossary are also provided. (Author/ESR)

Reproductions supplied by EDRS are the best that can be made from the original document.

An Overview of Computer-Based Natural Language Processing

National Aeronautics and Space Administration Washington, D.C.

U.S. DEPARTMENT OF EDUCATION, NATIONAL INSTITUTE OF EDUCATION. EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC): This document has been reproduced as received from the person or organization originating it. Minor changes have been made to improve reproduction quality. Points of view or opinions stated in this document do not necessarily represent official NIE position or policy. 1983.

U.S. DEPARTMENT OF COMMERCE, National Technical Information Service. NASA Technical Memorandum 85635. N83-24193

An Overview of Computer-Based Natural Language Processing

William B. Gevarter OFFICE OF AERONAUTICS AND SPACE TECHNOLOGY NASA Headquarters Washington, D.C. 20546

NASA National Aeronautics and Space Administration

REPRODUCED BY: NATIONAL TECHNICAL INFORMATION SERVICE, U.S. DEPARTMENT OF COMMERCE, SPRINGFIELD, VA 22161. Scientific and Technical Information Branch, 1983.

1. Report No.: NASA TM-85635
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: AN OVERVIEW OF COMPUTER-BASED NATURAL LANGUAGE PROCESSING
5. Report Date: April 1983
6. Performing Organization Code: RTC-6
7. Author(s): William B. Gevarter
8. Performing Organization Report No.:
9. Performing Organization Name and Address: National Aeronautics and Space Administration, Washington, D.C. 20546
10. Work Unit No.:
11. Contract or Grant No.:
12. Sponsoring Agency Name and Address: National Aeronautics and Space Administration, Washington, D.C. 20546
13. Type of Report and Period Covered: Technical Memorandum
14. Sponsoring Agency Code: RTC-6
15. Supplementary Notes:

16. Abstract

Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, or German, in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market, and the future looks bright for other applications as well. This report reviews the basic approaches to such systems; the techniques utilized; applications; the state of the art of the technology; issues and research requirements; the major participants; and, finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

17. Key Words (Suggested by Author(s)): Artificial Intelligence, Computational Linguistics, Natural Language, Machine Translation, Data Bases, Cognitive Science, Man-Machine Interface, Query Systems, Computers
18. Distribution Statement: Unclassified - Unlimited; Subject Category 61
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages: 54
22. Price: A04

For sale by the National Technical Information Service, Springfield, Virginia 22161. NASA-Langley, 1983.

AN OVERVIEW OF COMPUTER-BASED NATURAL LANGUAGE PROCESSING*

PREFACE

Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, or German, in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants, and future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

*This report is part of the NBS/NASA series of overview reports on Artificial Intelligence and Robotics.

ACKNOWLEDGEMENTS

I wish to thank the many people and organizations who have contributed to this report, both in providing information, and in reviewing the report and suggesting corrections, modifications and additions. I particularly would like to thank Dr. Brian Phillips of Texas Instruments, Dr. Ralph Grishman of the NRL AI Lab, Dr. Barbara Grosz of SRI International, Dr. Dara Dannenberg of Cognitive Systems, Dr. Gary Hendrix of Symantec, and Ms. Vicki Roy of NBS for their thorough review of this report and their many helpful suggestions. However, responsibility for any remaining errors or inaccuracies must remain with the author. It is not the intent of the National Bureau of Standards or NASA to recommend or endorse any of the systems, manufacturers, or organizations mentioned in this report, but simply to attempt to provide an overview of the NLP field. However, in a growing field such as NLP, important activities and products may not have been mentioned. Lack of such mention does not in any way imply that they are not also worthwhile. The author would appreciate having any such omissions or oversights called to his attention so that they can be considered for future reports.


TABLE OF CONTENTS

Page
A. Introduction 1
B. Applications 1
C. Approach 3
  1. Type A: No World Models 3
    a. Key Words or Patterns 3
    b. Limited Logic Systems 3
  2. Type B: Systems That Use Explicit World Models 3
  3. Type C: Systems That Include Information About the Goals and Beliefs of Intelligent Entities 3
D. The Parsing Problem 3
E. Grammar 4
  1. Phrase Structure Grammar (Context-Free Grammar) 4
  2. Transformational Grammar 4
  3. Case Grammar 4
  4. Semantic Grammars 5
  5. Other Grammars 5
F. Semantics and the Cantankerous Aspects of Language 5
  1. Multiple Word Senses 5
  2. Modifier Attachment 5
  3. Noun-Noun Modification 6
  4. Pronouns 6
  5. Ellipsis and Substitution 6
  6. Other Difficulties 6
G. Knowledge Representation 6
  1. Procedural Representations 6
  2. Declarative Representations 6
    a. Logic 6
    b. Semantic Networks 6
  3. Case Frames 6
  4. Conceptual Dependency 6
  5. Frames 7
  6. Scripts 7
H. Syntactic Parsing 7
  1. Template Matching 7
  2. Transition Nets 7
  3. Other Parsers 8


TABLE OF CONTENTS (continued)

Page
I. Semantics, Parsing and Understanding 10
J. NLP Systems 11
  1. Kinds 11
    a. Question Answering Systems 11
    b. Natural Language Interfaces (NLI's) 11
    c. Computer-Aided Instruction (CAI) 11
    d. Discourse 11
    e. Text Understanding 11
    f. Text Generation 12
    g. System Building Tools 12
  2. Research NLP Systems 12
    a. Interfaces to Computer Programs 12
    b. Natural Language Interfaces to Large Data Bases 13
    c. Text Understanding 13
    d. Text Generation 13
    e. … 13
    f. Current Research NLP Systems 13
  3. Commercial Systems 13
K. State of the Art 31
L. Problems and Issues 31
  1. How People Use Language 31
  2. Linguistics 31
  3. Conversation 31
  4. Processor Design 32
  5. Data Base Interfaces 33
  6. Text Understanding 33
M. Research Required 34
  1. How People Use Language 34
  2. Linguistics 34
  3. Conversation 34
  4. Processor Design 34
  5. Data Base Interfaces 35
  6. Text Understanding 35
N. Principal U.S. Participants in NLP 35
  1. Research and Development 35
  2. Principal U.S. Government Agencies Funding NLP Research 36
  3. Commercial NLP Systems 36
  4. Non-U.S. 37

TABLE OF CONTENTS (continued)

Page
O. Forecast 37
P. Further Sources of Information 38
  1. Journals 38
  2. Conferences 38
  3. Recent Books 39
  4. Overviews and Surveys
References 41
Glossary 43

LIST OF TABLES

Page
I. Some Applications of Natural Language Processing 2
II. Natural Language Understanding Systems 14
  a. SHRDLU 14
  b. SOPHIE 15
  c. TDUS 16
  d. LUNAR 17
  e. PLANES/JETS 18
  f. ROBOT/INTELLECT 19
  g. LADDER 20
  h. SAM 21
  i. PAM
III. Current Research NLP Systems 23
IV. Some Commercial Natural Language Systems 29

LIST OF FIGURES

Page
1. A Transition Network for a Small Subset of English 8
2. Example Noun Phrase Decomposition 9
3. Parse Tree Representation of the Noun Phrase Surface Structure 9

NATURAL LANGUAGE PROCESSING

A. Introduction

One major goal of Artificial Intelligence (AI) research has been to develop the means to interact with machines in natural language (in contrast to a computer language). The interaction may be typed, printed or spoken. The complementary goal has been to understand how humans communicate. The scientific endeavor aimed at achieving these goals has been referred to as "computational linguistics," an effort at the intersection of AI, linguistics, philosophy and psychology.

Human communication in natural language is an activity of the whole intellect. AI researchers, in trying to formalize what is required to properly address natural language, find themselves involved in the long-term endeavor of having to come to grips with this whole activity. (Formal linguists tend to restrict themselves to the structure of language.) The current AI approach is to conceptualize language as a knowledge-based system for processing communications and to create computer programs to model that process.

A communication act can serve many purposes, depending on the goals, intentions, and strategies of the communicator. One goal of a communication is to change some aspect of the recipient's mental state. Thus, communication endeavors to add or modify knowledge, change a mood, elicit a response, or establish a new goal for the recipients. For a computer program to interpret a relatively unrestricted natural language communication, a great deal of knowledge is required. Knowledge is needed of:
--the structure of sentences
--the meaning of words
--the morphology of words
--a model of the beliefs of the sender
--the rules of conversation, and
--an extensive shared body of general information about the world.

This body of knowledge can enable a computer (like a human) to use expectation-driven processing, in which knowledge about the usual properties of known objects and concepts, and about what typically happens in situations, can be used to understand incomplete or ungrammatical sentences in appropriate contexts. Thus, Barrow (1979, p. 12) observes:

In current attempts to handle natural language, the need to use knowledge about the subject matter of the conversation, and not just grammatical niceties, is recognized; it is now believed that reliable translation is not possible without such knowledge. It is essential to find the best interpretation of what is uttered that is consistent with all sources of knowledge: lexical, grammatical, semantic (meaning), topical, and contextual.

Or, more broadly, as Cognitive Science.


Arden (1980, p. 463) adds:

In writing a program for understanding languages, one is faced with all the problems of intelligence: problems of coping with huge amounts of knowledge, of finding ways to represent and describe complex cognitive structures, as well as finding an appropriate structure in a gigantic space of possibilities. Much of the research in understanding natural languages is aimed at these problems.

As indicated earlier, natural language communication between humans is very dependent upon shared knowledge, models of the world, models of the individuals they are communicating with, and the purposes or goals of the communication. Because the listener has certain expectations based on the context and his (or her) models, it is often the case that only minimal cues are needed in the communication to activate these models and determine the meaning.

The next section, B, briefly outlines applications for natural language processing (NLP) systems. Sections C to I review the technology involved in constructing such systems, with existing NLP systems being summarized in Section J. The state of the art, problems and issues, research requirements and the principal participants in NLP are covered in Sections K through N. Section O provides a forecast of future developments.

A glossary of terms in NLP is provided at the back of this report. Further sources of information are listed in Section P.

B. Applications

There are many applications for computer-based natural language understanding systems. Some of these are listed in Table I.

TABLE I. Some Applications of Natural Language Processing.

Discourse:
- Speech Understanding
- Story Understanding
- Explanation Modules for Computer Actions
- Interactive Interfaces to Computer Programs
- Question Answering Systems
- Computer-Aided Instruction

Interaction with Intelligent Programs:
- Expert Systems Interfaces
- Decision Support Systems

Information Access:
- Information Retrieval

Interacting with Machines:
- Control of Complex Machines

Information Acquisition or Transformation:
- Machine Translation
- Document or Text Understanding
- Automatic Paraphrasing
- Knowledge Compilation
- Knowledge Acquisition

Language Generation:
- Document or Text Generation
- Speech Output
- Writing Aids: e.g., grammar checking

C. Approach

Natural Language Processing (NLP) systems utilize both linguistic knowledge and domain knowledge to interpret the input. As domain knowledge (knowledge about the subject area of the communication) is so important to understanding, it is usual to classify the various systems based on their representation and utilization of domain knowledge. On this basis, Hendrix and Sacerdoti (1981) classify systems as Types A, B, and C*, with Type A being the simplest, least capable and correspondingly least costly systems.

1. Type A: No World Models

a. Key Words or Patterns

The simplest systems utilize ad hoc data structures to store facts about a limited domain. Input sentences are scanned by the programs for predeclared key words, or patterns, that indicate known objects or relationships. Using this approach, early simple template-based systems, while ignoring the complexities of language, sometimes were able to achieve impressive results. Usually, heuristic empirical rules were used to guide the interpretations.

b. Limited Logic Systems

In limited logic systems, information in the data base was stored in some formal notation, and language mechanisms were utilized to translate the input into the internal form. The internal form chosen was such as to facilitate performing logical inferences on information in the data base.
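The key-word approach above can be sketched in a few lines. This is a hypothetical illustration, not code from any system the report describes; the inventory domain, the patterns, and the `answer` function are all invented:

```python
import re

# A hypothetical Type A "key word" system: a few predeclared patterns for a
# tiny parts-inventory domain, each mapped to a canned action.  There is no
# grammar and no world model -- just pattern matching over the input string,
# as in the early template-based systems described above.
INVENTORY = {"bolt": 120, "washer": 300}

PATTERNS = [
    # "how many <X>s ..." -> look up the count for X
    (re.compile(r"how many (\w+?)s\b", re.I),
     lambda m: INVENTORY.get(m.group(1).lower())),
    # "do you have <X>s ..." -> membership test
    (re.compile(r"do you have (\w+?)s\b", re.I),
     lambda m: m.group(1).lower() in INVENTORY),
]

def answer(sentence):
    """Scan the input for the first matching pattern; None if nothing matches."""
    for pattern, action in PATTERNS:
        m = pattern.search(sentence)
        if m:
            return action(m)
    return None
```

Anything outside the predeclared patterns simply fails, which is exactly the limitation of such systems.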

2. Type B: Systems That Use Explicit World Models

In these systems, knowledge about the domain is explicitly encoded, usually in frame or network representations (discussed in a later section), that allow the system to understand input in terms of context and expectations. Cullingford's work (Schank and Abelson, 1977) on SAM (Script Applier Mechanism) is a good example of this approach.

3. Type C: Systems That Include Information About the Goals and Beliefs of Intelligent Entities

These advanced systems (still in the research stage) attempt to include in their knowledge base information about the beliefs and intentions of the participants in the communication. If the goal of the communication is known, it is much easier to interpret the message. Schank and Abelson's (1977) work on plans and themes reflects this approach.

D. The Parsing Problem

For more complex systems than those based on key words and pattern matching, language knowledge is required to interpret the sentences. The system usually begins by "parsing" the input (processing an input sentence to produce a more useful representation for further analysis). This representation is normally a structural description of the sentence indicating the relationships of the parts. To address the parsing problem and to interpret the result, the

computational linguistic community has studied syntax, semantics, and pragmatics. Syntax is the study of the structure of phrases and sentences. Semantics is the study of meaning. Pragmatics is the study of the use of language in context.

*Other system classifications are possible, e.g., those based on the range of syntactic coverage.

E. Grammar

Barr and Feigenbaum (1981, p. 229) state: "A grammar of a language is a scheme for specifying the sentences allowed in the language, indicating the syntactic rules for combining words into well-formed phrases and clauses." The following grammars are some of the most important.*

1. Phrase Structure Grammar (Context-Free Grammar)

Chomsky (see, for example, Winograd, 1983) had a major impact on linguistic research by devising a mathematical approach to language. Chomsky defined a series of grammars based on rules for rewriting sentences into their component parts. He designated these as types 0, 1, 2, or 3, based on the restrictions associated with the rewrite rules, with 3 being the most restrictive. Type 2, Context-Free (CF) or Phrase Structure Grammar (PSG), has been one of the most useful in natural language processing. It has the advantage that all sentence structure derivations can be represented as a tree and practical parsing algorithms exist. Though it is a relatively natural grammar, it is unable to capture all of the sentence constructions found in most natural languages such as English. Gazdar (1981) has recently broadened the applicability of CF PSG by adding augmentations to handle situations that do not fit the basic grammar. This Generalized Phrase Structure Grammar is now being developed by Hewlett Packard (Gawron et al., 1982).

2. Transformational Grammar

Tennant (1981, p. 89) observes that "The goal of a language analysis program is recognizing grammatical sentences and representing them in a canonical structure (the underlying structure)." A transformational grammar (Chomsky, 1957) consists of a dictionary, a phrase structure grammar, and a set of transformations. In analyzing sentences using a phrase structure grammar, first a parse tree is produced. This is called the surface structure. The transformational rules are then applied to the parse tree to transform it into a canonical form called the deep (or underlying) structure. As the same thing can be stated in several different ways, there may be many surface structures that translate into a single deep structure.

3. Case Grammar

Case Grammar is a form of Transformational Grammar in which the deep structure is based on cases: semantically relevant syntactic relationships. The central idea is that the deep structure of a simple sentence consists of a verb and one or more noun phrases associated with the verb in a particular relationship. These semantically relevant relationships are called cases. Fillmore (1971) proposed the following cases: Agent, Experiencer, Instrument, Object, Source, Goal, Location, Time, and …

*Charniak and Wilks (1976) provide a good overview of the various approaches.

The cases for each verb form an ordered set referred to as a "case frame." A case frame for the verb "open" would be:

(Object (Instrument) (Agent))

which indicates that open always has an object, but the instrument or agent can be omitted, as indicated by their surrounding parentheses. Thus the case frame associated with the verb provides a template which aids in understanding a sentence.
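The case-frame template for "open" can be rendered as a small data structure. This is a sketch only; the `CASE_FRAMES` table, the filler dictionaries, and the `valid_instance` helper are illustrative assumptions, not part of the report:

```python
# Each verb lists its cases; True marks a required case, False an optional
# one (the parenthesized cases in the text).  Only "open" is shown, matching
# the example case frame (Object (Instrument) (Agent)).
CASE_FRAMES = {
    "open": {"object": True, "instrument": False, "agent": False},
}

def valid_instance(verb, fillers):
    """Check a proposed set of case fillers against the verb's case frame:
    every required case must be filled, and no unknown case may appear.
    This is the template-matching aid to understanding described above."""
    frame = CASE_FRAMES[verb]
    if any(case not in frame for case in fillers):
        return False
    return all(case in fillers for case, required in frame.items() if required)
```

"John opened the door" fills object and agent; omitting the object violates the frame.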

4. Semantic Grammars

To achieve practical systems in limited domains, it is often useful to use meaningful semantic components instead of conventional syntactic constituents such as noun phrases, verb phrases and prepositions. Thus, in place of nouns when dealing with a naval data base, one might use ships, captains, ports and cargoes. This approach gives direct access to the semantics of a sentence and substantially simplifies and shortens the processing. Grammars based on this approach are referred to as semantic grammars (see, e.g., Burton, 1976).

5. Other Grammars

A variety of other, but less prominent, grammars have been devised. Still others can be expected to be devised in the future. One example is Montague Grammar (Dowty et al., 1981), which uses a logical functional representation for the grammar and therefore is well suited for the parallel-processing logical approach now being pursued by the Japanese (see Nishida and Doshita, 1982) for their future AI work as embodied in their Fifth Generation Computer research project.
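A semantic grammar for the naval-data-base example might look as follows. The vocabulary, the two query templates, and the `parse_query` function are invented for illustration:

```python
# A toy semantic grammar: rule categories are domain concepts (SHIP, PORT)
# rather than syntactic ones (NP, VP), so a successful match yields the
# semantic roles directly, with no intermediate syntactic parse.
LEXICON = {
    "SHIP": {"kennedy", "enterprise"},
    "PORT": {"norfolk", "naples"},
}

def parse_query(words):
    """Match 'where is <SHIP>' or 'is <SHIP> in <PORT>' templates."""
    w = [t.lower() for t in words]
    if len(w) == 3 and w[:2] == ["where", "is"] and w[2] in LEXICON["SHIP"]:
        return {"action": "locate", "ship": w[2]}
    if (len(w) == 4 and w[0] == "is" and w[1] in LEXICON["SHIP"]
            and w[2] == "in" and w[3] in LEXICON["PORT"]):
        return {"action": "check", "ship": w[1], "port": w[3]}
    return None
```

The price of this directness is narrowness: the grammar covers only the domain it was written for.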

F. Semantics and the Cantankerous Aspects of Language

Semantic processing, as it tries to interpret phrases and sentences, attaches meanings to the words. Unfortunately, English does not make this as simple as looking up the word in the dictionary, but provides many difficulties which require context and other knowledge to resolve.

1. Multiple Word Senses

Syntactic analysis can resolve whether a word is used as a noun or a verb, but further analysis is required to select the sense (meaning) of the noun or verb that is actually used. For example, "fly" used as a noun may be a winged insect, a fancy fishhook, a baseball hit high in the air, or several other interpretations as well. The appropriate sense can be determined by context (e.g., for "fly" the appropriate domain of interest could be extermination, fishing, or sports), or by matching each noun sense with the senses of other words in the sentence. This latter approach was taken by Rieger and Small (1979), using the (still embryonic) technique of "interacting word experts," and by Finin (1980) and McDonald (1982) as the basis for understanding noun compounds.
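Sense selection by domain of interest, as in the "fly" example above, can be sketched directly. The sense inventory and the `select_sense` helper are hypothetical illustrations:

```python
# Each sense of an ambiguous word is tagged with the domain it belongs to;
# the active domain of discourse then picks out the intended sense, as the
# text describes for "fly".
SENSES = {
    "fly": {
        "extermination": "winged insect",
        "fishing": "fancy fishhook",
        "sports": "ball hit high in the air",
    }
}

def select_sense(word, domain):
    """Return the sense of `word` appropriate to the current domain, or
    None when the word has no sense registered for that domain."""
    return SENSES.get(word, {}).get(domain)
```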

2. Modifier Attachment

Where to attach a prepositional phrase to the parse tree cannot be determined by syntax alone but requires semantic knowledge. "Put the plant in the box on the table" is an example illustrating the difficulties that can be encountered with prepositional phrases.

3. Noun-Noun Modification

Choosing the appropriate relationship when one noun modifies another depends on semantics. For example, for "apple vendor," one's knowledge tends to force the interpretation "vendor of apples" rather than "an apple that is a vendor."

4. Pronouns Pronouns allow a simplified reference to previously used (or implied) nouns, sets or events. Where feasible, pronoun antecedents are usually identified by reference to the most recent noun phrase having the same pragmatic context as the pronoun.
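The most-recent-antecedent heuristic can be sketched as a backward scan over the noun phrases seen so far. The feature names (`number`, `animate`) and the example history are assumptions for illustration:

```python
# Candidate noun phrases are kept in order of mention (most recent last).
# A pronoun is resolved to the most recent noun phrase whose features agree
# with it -- the simple heuristic described in the text.
def resolve_pronoun(pronoun_features, preceding_nps):
    """Scan candidates from most recent to oldest; return the first whose
    features (e.g., number, animacy) all agree with the pronoun's."""
    for np in reversed(preceding_nps):
        if all(np.get(k) == v for k, v in pronoun_features.items()):
            return np["text"]
    return None

history = [
    {"text": "the astronauts", "number": "plural", "animate": True},
    {"text": "the satellite", "number": "singular", "animate": False},
]
```

Given this history, "it" resolves to "the satellite" and "they" to "the astronauts".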

5. Ellipsis and Substitution

Ellipsis is the phenomenon of not stating explicitly some words in a sentence, but leaving it to the reader or listener to fill them in. Substitution is similar, using a dummy word in place of the omitted words. Employing pragmatics, ellipses and substitutions are usually resolved by matching the incomplete statement to the structures of previous recent sentences, finding the best partial match, and then filling in the rest from this matching previous structure.

6. Other Difficulties In addition to those just mentioned, there are other difficulties, such as anaphoric references, ambiguous noun groups, adjectivals, and incorrect language usage.

G. Knowledge Representation*

As the AI approach to natural language processing is heavily knowledge-based, it is not surprising that a variety of knowledge representation (KR) techniques have found their way into the field. Some of the more important ones are:

1. Procedural Representations: The meanings of words or sentences are expressed as computer programs that reason about their meaning.

2. Declarative Representations
a. Logic: Representation in First Order Predicate Logic, for example.
b. Semantic Networks: Representations of concepts, and relationships between concepts, as graph structures consisting of nodes and labeled connecting arcs.
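A semantic network of nodes and labeled arcs might be sketched as follows. The `SemanticNet` class, the "isa" traversal, and the example facts are illustrative, not drawn from any system in the report:

```python
# A minimal semantic network: concepts are nodes, and each labeled arc is a
# (node, label, node) triple.  The helper follows "isa" arcs upward, the
# kind of inheritance walk such networks support.
class SemanticNet:
    def __init__(self):
        self.arcs = []  # (from_node, label, to_node) triples

    def add(self, frm, label, to):
        self.arcs.append((frm, label, to))

    def isa_chain(self, node):
        """Follow 'isa' arcs upward, collecting the ancestors of a concept."""
        chain = []
        current = node
        while True:
            parents = [t for f, l, t in self.arcs if f == current and l == "isa"]
            if not parents:
                return chain
            current = parents[0]
            chain.append(current)

net = SemanticNet()
net.add("Clyde", "isa", "elephant")
net.add("elephant", "isa", "mammal")
net.add("elephant", "color", "gray")
```

Properties attached along the chain (here, "gray") can then be inherited by the individual.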

3. Case Frames (covered earlier)

4. Conceptual Dependency: This approach (related to case frames) is an attempt to provide a representation of all actions in terms of a small number of semantic primitives into which input

*More complete presentations on KR can be found in Chapter III of Barr and Feigenbaum (1981) and in Gevarter (1983).


sentences are mapped (see, e.g., Schank and Riesbeck, 1981). The system relies on 11 primitive physical, instrumental and mental ACTs (PROPEL, GRASP, SPEAK, ATTEND, PTRANS, ATRANS, etc.), plus several other categories or concept types.

5. Frames: A frame is a complex data structure for representing a whole situation, complex object, or series of events. A frame has slots for objects and relations appropriate to the situation.

6. Scripts: Frame-like data structures for representing stereotyped sequences of events, to aid in understanding simple stories.
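A script can be sketched as a frame-like structure whose slots hold an expected event sequence. The restaurant example (a staple of the script literature) and the gap-filling helper are illustrative assumptions, not taken from the report:

```python
# A frame-like script: role slots plus a stereotyped event sequence.
restaurant_script = {
    "roles": {"customer": None, "waiter": None},
    "events": ["enter", "order", "eat", "pay", "leave"],
}

def fill_story_gaps(mentioned_events, script):
    """Return the script events a simple story left implicit -- the kind of
    expectation-driven inference that scripts support in story understanding."""
    return [e for e in script["events"] if e not in mentioned_events]
```

A story that mentions only entering, ordering, and leaving lets the script supply the unstated eating and paying.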

H. Syntactic Parsing

Parsing assigns structures to sentences. The following types of parsers have been developed over the years for NLP (Barr and Feigenbaum, 1981).

1. Template Matching: Most of the early, and some current, NL programs perform parsing by matching their input sentences against a series of stored templates.

2. Transition Nets

Phrase structure grammars can be syntactically decomposed using a set of rewrite rules such as those indicated in Figure 1. Observe that a simple sentence can be rewritten as a Noun Phrase and a Verb Phrase as indicated by:

S → NP VP

The noun phrase can be rewritten by the rule

NP → (DET) (ADJ*) N (PP)

where the parentheses indicate that the item is optional, while the asterisk indicates that any number of the items may occur. The items, if they appear in the sentence, must occur in the order shown. The following example shows how a noun phrase can be analyzed:

NP: "The large satellite in the sky" → DET (The) ADJ (large) N (satellite) PP (in the sky)

where PP is a prepositional phrase. Thus, the parser examines the first word to see if it corresponds to its list of determiners (the, a, one, every, etc.). If the first word is found to be a determiner, the parser notes this and proceeds on to the next word; otherwise it checks to see if the first word is an adjective, and so forth. If a preposition is encountered in the sentence, the parser calls the prepositional phrase (PP) rule.

An NP transition network is shown as the second diagram in Figure 1. It starts in the initial state (4) and moves to state (5) if it finds a determiner or an adjective, or on to state (6) when a noun is found. The loops for ADJ and PP indicate that more than one adjective or prepositional phrase can occur. Note that the PP rule can in turn call an NP rule, resulting in a nested structure. An example of an analyzed noun phrase is shown in Figures 2 and 3.


GRAMMAR

S → NP VP
NP → (DET) (ADJ*) N (PP*)
PP → PREP NP
VP → VTRANS NP

Figure 1. A Transition Network for a Small Subset of English. Each diagram represents a rule for finding the corresponding word pattern. Each rule can call on other rules to find needed patterns.

After Graham (1979, p. 214).

The payload on a tether under the shuttle
  NP → DET (The) N (payload) PP (on a tether under the shuttle)
  PP → PREP (on) NP (a tether under the shuttle)
  NP → DET (a) N (tether) PP (under the shuttle)
  PP → PREP (under) NP (the shuttle)
  NP → DET (the) N (shuttle)

Figure 2. Example Noun Phrase Decomposition.

[Figure 3. Parse Tree Representation of the Noun Phrase Surface Structure: the tree for "The payload on a tether under the shuttle," with each NP expanding into DET, N, and PP constituents as in Figure 2.]

As the transition networks analyze a sentence, they can collect information about the word patterns they recognize and fill slots in a frame associated with each pattern. Thus, they can identify noun phrases as singular or plural, whether the nouns refer to persons (and, if so, their gender), etc., as needed to produce a deep structure. A simple approach to collecting this information is to attach subroutines to be called for each transition. A transition network with such subroutines attached is called an "augmented transition network," or ATN. With ATNs, word patterns can be recognized, and for each word pattern slots can be filled in a frame. The resulting filled frames provide a basis for further processing.
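The NP network of Figure 1, with a slot-filling subroutine attached to each transition in the ATN spirit, might be sketched as follows. The word lists and the frame layout are invented for illustration:

```python
# Word classes for a tiny lexicon (illustrative only).
DET = {"the", "a", "one", "every"}
ADJ = {"large", "small", "red"}
NOUN = {"satellite", "sky", "payload", "tether", "shuttle"}
PREP = {"in", "on", "under"}

def parse_np(words, i=0):
    """Traverse the NP network NP -> (DET) (ADJ*) N (PP*) starting at
    position i, filling slots in a frame at each transition.
    Returns (frame, next_position), or (None, i) on failure."""
    frame = {"det": None, "adjs": [], "noun": None, "pps": []}
    if i < len(words) and words[i] in DET:          # optional determiner
        frame["det"] = words[i]; i += 1
    while i < len(words) and words[i] in ADJ:       # any number of adjectives
        frame["adjs"].append(words[i]); i += 1
    if i < len(words) and words[i] in NOUN:         # required noun
        frame["noun"] = words[i]; i += 1
    else:
        return None, i
    while i < len(words) and words[i] in PREP:      # PP rule calls the NP rule,
        prep = words[i]                             # giving a nested structure
        sub, j = parse_np(words, i + 1)
        if sub is None:
            break
        frame["pps"].append({"prep": prep, "np": sub}); i = j
    return frame, i

tree, _ = parse_np("the large satellite in the sky".split())
```

The filled frames, rather than a bare parse tree, are what downstream semantic processing would consume.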

3. Other Parsers

Other parsing approaches have been devised, but ATNs remain the most popular syntactic parsers. ATNs are top-down parsers, in that the parsing is directed by an anticipated sentence structure. An alternative approach is bottom-up parsing, which examines the input words along the string from left to right, building up all possible structures to the left of the current word as the parser advances. A bottom-up parser could thus build many partial sentence structures that are never used, but the diversity could be an advantage in trying to interpret input word strings that are not clearly delineated sentences, or that contain ungrammatical constructions or unknown words. There have been recent attempts to combine the top-down with the bottom-up approach for NLP, in a similar manner as has been done for computer vision (see, e.g., Gevarter, 1982). For a recent overview of parsing approaches, see Slocum (1981).

I. Semantics, Parsing and Understanding

The role of syntactic parsing is to construct a parse tree or similar structure of the sentence to indicate the grammatical use of the words and how they are related to each other. The role of semantic processing is to establish the meaning of the sentence. This requires facing up to all the cantankerous ambiguities discussed earlier.

In natural languages (unlike restricted languages, e.g., semantic grammars) it is often difficult to parse the sentences and hook phrases into the proper portion of the parse tree without some knowledge of the meaning of the sentence. This is especially true when the discourse is ungrammatical. Therefore, it has been suggested that semantics be used to help guide the path of the syntactic parser (see, for example, Charniak, 1981). In that case, syntax presses ahead as far as it can and then hands off its results to the semantic portion to disambiguate the possibilities. Woods (1980) has extended ATN grammars for this purpose. Barr and Feigenbaum (1981, p. 257) indicate that present language understanding systems are indeed tending toward the use of multiple sources of knowledge and are intermixing syntactics and semantics.

Charniak (1981) indicates that there have been two main lines of attack on word sense ambiguity. One is the use of discrimination nets (Rieger and Small, 1979) that utilize the syntactic parse tree (by observing the grammatical role that the word plays, such as taking a direct object, etc.) to help decide the word sense. The other approach is based on the frame/script idea (used, e.g., for story comprehension) that provides a context and the expected sense of the word (see, e.g., Schank and Abelson, 1977).


Another approach is "preference semantics" (Wilks, 1975), which is a system of semantic primitives through which the best sense in context is determined. This system uses a lexicon in which the various senses of the words are defined in terms of semantic primitives (grouped into entities, actions, cases, qualifiers, and type indicators). Representation of a sentence is in terms of these primitives, which are arranged to relate agents, actions and objects. These have preferential relations to each other; Wilks' approach finds the match that best satisfies these preferences.

Charniak indicates that semantics at the level of the word sense is not the end of the parsing process; what is desired is understanding or comprehension (associated with pragmatics). Here the use of frames, scripts and more advanced topics such as plans, goals, and knowledge structure (see, e.g., Schank and Riesbeck, 1981) plays an important role.

J. NLP Systems

As indicated below, various NLP systems have been developed for a variety of functions.

1. Kinds

a. Question Answering Systems

Natural language question answering systems have perhaps been the most popular of the NLP research systems. They have the advantage that they usually utilize a data base for a limited domain and that most of the user discourse is limited to questions.

b. Natural Language Interfaces (NLI's)

These systems are designed to provide a painless means of communicating questions or instructions to a complex computer program.

c. Computer-Aided Instruction (CAI)

Arden (1980, p. 465) states:

One type of interaction that calls for ability in natural languages is the interaction needed for effective teaching machines. Advocates of computer-aided instruction have embraced numerous schemes for putting the computer to use directly in the educational process. It has long been recognized that the ultimate effectiveness of teaching machines is linked to the amount of intelligence embodied in the programs. That is, a more intelligent program would be better able to formulate the questions and presentations that are most appropriate at a given point in a teaching dialog, and it would be better equipped to understand a student's response, even to analyze and model the knowledge state of the student, in order to tailor the teaching to his needs. Several researchers have already used the teaching dialogue as the basis for looking at natural languages and reasoning. For example, the SCHOLAR system of Carbonell and Collins tutors students in geography, doing complex reasoning in deciding what to ask and how to respond to a question. Meanwhile, SOPHIE teaches electronic circuits by integrating a natural-language component with a specialized system for circuit behavior. Although these systems are still too costly for general use, they will almost certainly be developed further and become practical in the near future.

d. Discourse

Systems that are designed to understand discourse (extended dialogue) usually employ pragmatics. Pragmatic analysis requires a model of the mutual beliefs and knowledge held by the speaker and listener.

e. Text Understanding

Though Schank (see Schank and Riesbeck, 1981) and others have addressed themselves to this problem, much more remains to be done. Techniques for understanding printed text include scripts and causative approaches.
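The preference-semantics matching described above can be sketched as scoring candidate word senses by how many of their stated preferences the context satisfies. The primitives and sense names below are invented stand-ins, not Wilks' actual primitive vocabulary.

```python
# Each candidate sense of "drink" states a preference for the semantic
# primitives of its agent and its object; the sense satisfying the most
# preferences in the current context wins (primitives/senses invented).
SENSES = {
    "drink/ingest": {"agent": "ANIMATE", "object": "LIQUID"},
    "drink/absorb": {"agent": "PLANT",   "object": "LIQUID"},
}

def best_sense(agent_primitive, object_primitive):
    def score(prefs):
        # count satisfied preferences; booleans add as 0 or 1
        return ((prefs["agent"] == agent_primitive)
                + (prefs["object"] == object_primitive))
    return max(SENSES, key=lambda s: score(SENSES[s]))

# "The man drank the water": the agent is ANIMATE, the object LIQUID,
# so the ingest sense best satisfies the preferences.
choice = best_sense("ANIMATE", "LIQUID")
```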

Arden (1980, pp. 465-466) states:

To understand a text, a system needs not only a knowledge of the structure of the language but a body of "world knowledge" about the domain discussed in the text. Thus a comprehensive text-understanding system presupposes an extensive reasoning system, one with a base of common-sense and domain-specific knowledge.... The problem of understanding a piece of text does, however, serve as a basic framework for current research in natural languages. Programs are written which accept text input and illustrate their understanding of it by answering questions, giving paraphrases, or simply providing a blow-by-blow account of the reasoning that goes on during the analysis. Generally, the programs operate only on a small preselected set of texts created or chosen by the author for exploring a small set of theoretical problems.

f. Text Generation

There are two major aspects of text generation: one is the determination of the content and textual shape of the message; the second is transforming it into natural language. There are two approaches for accomplishing this. The first is indexing into canned text and combining it as appropriate. The second is generating the text from basic considerations. One need for text generation results from situations in which information sources need to be combined to form a new message. Unfortunately, simply adjoining sentences from different contexts usually produces confusing or misleading text. Another need for text generation is for explanations of Expert System actions. Text generation will become particularly important as data bases gradually shift to true knowledge bases where complex output has to be presented linguistically. McDonald's thesis (1980) provides one of the most sophisticated approaches to text generation.

g. System Building Tools

Recently, computer languages and programs especially designed to aid in building NLP systems have begun to appear.
An example is OWL, developed at MIT as a semantic network knowledge representation language for use in constructing natural language question answering systems.
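A semantic-network representation of the kind OWL provides can be approximated, at its simplest, by a store of (node, relation, node) triples with lookup routines over the links. The relations and entities below are invented for illustration and do not reflect OWL's actual representation language.

```python
# A semantic network reduced to (node, relation, node) triples, with a
# trivial question-answering traversal; entities and relations are
# invented and are not OWL's actual syntax.
TRIPLES = [
    ("moon-rock-101", "isa", "basalt"),
    ("basalt",        "isa", "igneous-rock"),
    ("moon-rock-101", "collected-by", "apollo-11"),
]

def isa_chain(node):
    """Follow 'isa' links upward to answer 'what kind of thing is X?'."""
    chain = [node]
    while True:
        parents = [o for s, r, o in TRIPLES
                   if s == chain[-1] and r == "isa"]
        if not parents:
            return chain
        chain.append(parents[0])

kinds = isa_chain("moon-rock-101")
```

Question answering against such a net is then a matter of translating the parsed question into a traversal like the one above.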

2. Research NLP Systems

Until recently, virtually all of the NLP systems generated were of a research nature. These NLP systems basically were aimed at serving five functions:

a. Interfaces to Computer Programs
b. Data Base Retrieval
c. Text Understanding
d. Text Generation
e. Machine Translation

A few of the more prominent systems are briefly reviewed in this section.

a. Interfaces to Computer Programs

One of the most important early NLP systems, SHRDLU, was a complete system combining syntactic and semantic processing. This system, designed as an interface to a research Blocks World simulation, is described in Table IIa.

SOPHIE (Table IIb), a Computer-Aided Instruction (CAI) system, made use of a semantic grammar to parse the input and to provide instruction based on a simulation of a power supply circuit.

TDUS (Table IIc) uses a procedural network (which encodes basic repair operations) to interpret a dialog with an apprentice engaged in repair of an electromechanical pump.

b. Natural Language Interfaces to Large Data Bases

One of the important and prominent research areas for NLP is intelligent front ends to data base retrieval systems.

LUNAR (Table IId) is one of the most often cited early systems. It utilized a powerful ATN syntactic parser which passed on its results to a semantic analyzer.

PLANES (Table IIe) was a system designed as a front end to the Navy's data base of maintenance and flight records for all naval aircraft. This semantic-grammar-based system ignores the sentence's syntax, searching instead for meaningful semantic constituents by using ATN subnets. These subnets include PLANETYPE, TIME PERIOD, ACTION, etc.

ROBOT (Table IIf) uses an ATN syntactic parser followed by a semantic analyzer to produce a formal query language representation of the input sentence. ROBOT has proved to be very versatile.

LIFER/LADDER (Table IIg) uses patterns or templates to interpret sentences. It employs a semantic (pragmatic) grammar, which greatly simplifies the interpretation. It can handle ellipses and pronouns.

c. Text Understanding

SAM (Table IIh) is a research system that attempts to understand text about everyday events. Knowledge is encoded in frames called scripts. SAM uses an English to Conceptual Dependency parser to produce an internal representation of the story.

PAM (Table IIi) is one offspring of SAM. PAM understands stories by determining the goals that are to be achieved in the story. It then attempts to match actions of the story with methods that it knows will achieve the goals.

d. Text Generation

Winograd (1983) indicates that the difficult problems in generation are those concerned with meaning and context rather than syntax. Thus, until recently, text generation has been mostly an outgrowth of portions of other NLP systems.

e. Machine Translation

Though machine translation was the first attempt at NLP, early failures resulted in little further work being done in this area until recently.

f. Current Research NLP Systems

Table III lists NLP systems currently being researched.

3. Commercial Systems

The commercial systems available today, together with their approximate prices, are listed in Table IV. Several of these systems are derivatives of the research NLP systems previously discussed.

TABLE IIa. Natural Language Understanding Systems.

System/Use: SHRDLU, M.I.T. (Winograd, T., 1972). Natural language interface to manipulate blocks in a simulated Blocks World.

Approach: Combines syntactic and semantic analysis with a body of world knowledge about a limited domain to provide a NLI for manipulating blocks in a simulation of an artificial "Blocks World." Starts the analysis of a sentence by syntactically parsing a meaningful portion of the sentence; semantic routines are then called to analyze the unit. The definitions of words in the dictionary are in the form of procedures (procedural semantics) that analyze the unit. These procedures set semantic markers of possible relations to other words. If there are no semantic objections, the syntactic parser continues; otherwise it tries another parse. Facts are expressed in First Order Predicate Logic, and hypotheses are verified by theorem-proving. Generates text by "fill in the blank" and stored response patterns. Heuristically uses pronouns for noun phrases to reduce the stilted nature of the text response.

Capabilities: One of the first systems to deal simultaneously with many sophisticated issues of NLP: parsing, semantics, references to previous discourse, knowledge representation, and problem solving.

Limitations: Assumes it knows everything about the world. Assumes the world is logical, simple, small and closed. Required familiarization by the user to use it successfully. Was a prototype that proved to be non-portable and non-extensible and is no longer in use.

Type B System.

TABLE IIb. Natural Language Understanding Systems.

System/Use: SOPHIE (Sophisticated Instructional Environment), (Brown and Burton, 1975). CAI in electronic trouble shooting.

Approach: Incorporated a simulation of a power supply circuit to test student suggestions. Employed a semantic grammar using constituents like: Request, Fault, Instrument, Node/Name, and Junction/Type. The semantic grammar worked much like a syntactic parser, but the nodes in the resulting parse tree were meaningful semantic units. The grammar operated top-down in a recursive fashion. Each grammar rule was a LISP procedure that generated a semantic representation of a subtree in the parse.

Capabilities: Could run simulations, abstract them and use the results. Responded in a few seconds. Could skip words that did not match the grammar rule. Very successful and robust.

Limitations: Skipping words might change the meaning of a sentence significantly. The system organization restricts the system to only this limited domain.

Type A+ System.

TABLE IIc. Natural Language Understanding Systems.
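The SOPHIE-style semantic grammar, in which each rule is a procedure that returns a meaningful semantic unit rather than a purely syntactic node, can be sketched as a small top-down recursive parser. The constituents (REQUEST, NODE, INSTRUMENT) and the token formats are invented stand-ins for SOPHIE's actual grammar.

```python
# A top-down, recursive semantic grammar in the style described for
# SOPHIE: each rule is a procedure that consumes tokens and returns a
# semantic unit directly (constituents and token formats are invented).
INSTRUMENTS = {"voltmeter", "scope"}

def parse_request(tokens):
    """REQUEST -> 'measure' NODE 'with' INSTRUMENT"""
    it = iter(tokens)
    if next(it, None) != "measure":
        return None
    node = parse_node(next(it, None))
    if node is None or next(it, None) != "with":
        return None
    inst = parse_instrument(next(it, None))
    if inst is None:
        return None
    # the "parse tree" node is itself a semantic representation
    return {"act": "MEASURE", "node": node, "instrument": inst}

def parse_node(tok):
    # circuit nodes written as n1, n2, ... (an invented convention)
    return tok if tok and tok.startswith("n") and tok[1:].isdigit() else None

def parse_instrument(tok):
    return tok if tok in INSTRUMENTS else None

req = parse_request("measure n5 with voltmeter".split())
```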

System/Use: TDUS (Task Oriented Dialogue System), SRI (Robinson, 1980). Interactive dialog in context; guides repair operations on electromechanical equipment.

Approach: The goal was to follow the context as an apprentice moved from task to task and to respond successfully to his remarks and requests for guidance. The various tasks to be performed were encoded in procedural networks, an extension of standard network formalisms that allows encoding of quantified information and information about repair processes. Uses the procedural network system to interpret the dialog and to infer unstated intermediate steps. Assumes that referential statements refer to objects salient in the current subtask or higher in the task hierarchy. Uses context and discourse to identify objects referred to by definite noun phrases.

Capabilities: Understands contexts, so it can interpret remarks such as "should," "done it," etc. Can follow particular instantiations of actions. Realizes the program does not know all things (does not operate on a "closed world" assumption).

Limitations: Little understanding of the goals and motivations of the apprentice.

Type B+ System.

TABLE IId. Natural Language Understanding Systems.

System/Use: LUNAR (Woods, 1973). Natural language interface to Moon Rocks data base.

Approach: A simplified data base, with the LUNAR data encoded in the data base query language. Used a powerful ATN syntactic parser; the parsed sentence is sent on to the semantic program for translation into a query, and the resulting query is then executed. The semantic analyzer gathers information from verbs and their cases, nouns, noun modifiers and determiners to build the data base query. The query is built in terms of the conceptual primitives of the data base. Uses rules to compare the syntactic structure of the question with a syntactic template; if they match, the semantic part of the rule is added to the developing query.

Capabilities: Can handle anaphoric references (pronoun references to previous phrases). Could handle 90% of the questions posed to LUNAR by geologists. The overall formulation is so clean and neat that it has since been used for most parsing and language understanding systems (Waltz, 1981, p. 10).

Limitations: Only a small vocabulary (3500 words). As the ATN and semantic analyzer are separate, the semantic analyzer must grope for errors such as prepositional phrases being attached at the wrong point in the parse tree. Seven data domains; the sets of data elements that could be members of each domain were mutually exclusive. Utterances were limited to strict data base inquiries. Based on a "closed world" viewpoint. Proved to be non-portable and non-extensible; no longer in use.

Type B System.
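LUNAR's rule mechanism, in which a syntactic template is compared against the parsed question and the rule's semantic part is added to the developing query, can be sketched as follows. The template slots and the query-fragment notation are invented for illustration, not LUNAR's actual query language.

```python
# LUNAR-style rule application in miniature: each rule pairs a syntactic
# template with a query fragment; a matching template contributes its
# fragment (with variables bound) to the developing data base query.
# Templates, slots, and the query notation are invented stand-ins.
RULES = [
    ({"verb": "contain", "subject": "sample", "object": "?element"},
     "(CONTAINS SAMPLE ?element)"),
    ({"verb": "be", "subject": "age", "object": "?value"},
     "(AGE SAMPLE ?value)"),
]

def match(template, parse):
    """Return variable bindings if the parse fits the template, else None."""
    bindings = {}
    for slot, want in template.items():
        got = parse.get(slot)
        if want.startswith("?"):          # a variable slot
            if got is None:
                return None
            bindings[want] = got
        elif got != want:                 # a literal slot must match exactly
            return None
    return bindings

def build_query(parse):
    query = []
    for template, fragment in RULES:
        bindings = match(template, parse)
        if bindings is not None:
            piece = fragment
            for var, val in bindings.items():
                piece = piece.replace(var, val)
            query.append(piece)
    return query

q = build_query({"verb": "contain", "subject": "sample", "object": "olivine"})
```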

TABLE IIe. Natural Language Understanding Systems.

System/Use: PLANES/JETS (Programmed Language-based Enquiry System), M.I.T. (Waltz, D. L., 1975). Natural language interface to a large data base.

Approach: The data base is the Navy's relational data base holding the maintenance and flight records for all naval aircraft. Ignores syntax; assumes that all inputs are in the form of requests, which it turns into formal language query expressions. Uses a semantic grammar: it looks for semantic constituents by doing a left to right scan of the user's sentence. Semantic constituents include items which belong to PLANETYPE, TIME PERIOD, MALFUNCTION CODE, HOW MANY, ACTION, etc. Uses an ATN parser; the top level calls various subnets to analyze the input for semantic constituents. Utilizes concept case frames, which are strings of constituents of reasonable queries. After application of the concept case frames, the resulting semantic constituents are passed along to the query generator.

Capabilities: Can handle ellipses and pronouns. Can deal with some nongrammatical sentences. Asks for a rephrase if it doesn't understand.

Limitations: Relatively inefficient; could benefit from a look ahead, which could result in an order of magnitude reduction in the number of arcs tested in the parse of a sentence. Problems with word sense selection and modifier attachment; relies too heavily on its particular world of discourse for eliminating problems of word sense selection. In a 1980 test, PLANES understood about 2/3 of queries correctly. Could be made into a useful practical program with further work.
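The PLANES pipeline described above, which is a syntax-ignoring scan for semantic constituents followed by a concept-case-frame check, can be sketched in a few lines. The constituent vocabulary and case frame below are invented stand-ins for the real subnets.

```python
# PLANES-style processing in miniature: scan left to right for semantic
# constituents, then check the result against a "concept case frame"
# (a string of constituents forming a reasonable query). The vocabulary
# and the case frame are invented stand-ins for the actual subnets.
CONSTITUENTS = {
    "f4": ("PLANETYPE", "F4"), "phantom": ("PLANETYPE", "F4"),
    "january": ("TIMEPERIOD", "JAN"), "crashed": ("ACTION", "CRASH"),
}
CASE_FRAMES = [("PLANETYPE", "ACTION", "TIMEPERIOD")]

def scan(words):
    """Collect semantic constituents, ignoring all other words (and
    hence the syntax); pass the result on only if a case frame fits."""
    found = [CONSTITUENTS[w] for w in words if w in CONSTITUENTS]
    kinds = tuple(kind for kind, _ in found)
    if kinds in CASE_FRAMES:
        return dict(found)   # would be handed to the query generator
    return None              # would trigger a request to rephrase

query = scan("how many f4 planes crashed in january".split())
```

Note how "how many planes" and "f4 planes crashed" contribute nothing syntactically; only the recognized constituents matter, which is both the strength (robustness to ungrammatical input) and the weakness (word sense problems) the table records.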

Type A System.

TABLE IIf. Natural Language Understanding Systems.

System/Use: ROBOT/INTELLECT, Dartmouth (Harris, 1977). Data base question answering system.

Approach: Uses an ATN syntactic parser (with backtracking) followed by semantic analysis to produce a formal query language representation of the input sentence. Handles a large vocabulary by building an inverted file of data element names indicating the data domains in which each name occurs; in addition, the inverted file contains words and phrases that are interpreted as data element names. A dictionary of common English words is also included. If two meanings of the inquiry appear likely and only one returns hits, that one is interpreted to be the appropriate one.

Capabilities: INTELLECT is one of the first N.L. data base query systems to be available commercially. Can handle idioms via special mechanisms. Can adapt INTELLECT to a new data base in approximately one week. Can handle some pronouns and ellipses.

Limitations: Does not consider context except to disambiguate pronouns and ellipses.

Type A System.
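ROBOT's disambiguation heuristic, preferring the reading of an ambiguous inquiry that actually returns hits from the data base, is simple enough to sketch directly. The data base contents and field names below are invented for illustration.

```python
# The ROBOT/INTELLECT heuristic in miniature: when a query is ambiguous,
# try each reading against the data base and keep the one that actually
# returns hits (data base contents and field names are invented).
DATABASE = {"city": ["Paris", "Austin"], "person": ["Austin Smith"]}

def hits(field, value):
    return [row for row in DATABASE[field] if value in row]

def disambiguate(candidates):
    """candidates: list of (field, value) readings of an ambiguous word.
    If exactly one reading returns data base hits, choose it; otherwise
    the ambiguity stands and None is returned."""
    live = [c for c in candidates if hits(*c)]
    return live[0] if len(live) == 1 else None

# "Paris" could in principle be a city name or part of a person's name;
# only the city reading returns hits, so it is chosen.
choice = disambiguate([("city", "Paris"), ("person", "Paris")])
```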

TABLE IIg. Natural Language Understanding Systems.

System/Use: LADDER (Language Access to Distributed Data with Error Recovery), SRI (Hendrix et al., 1978). Natural language data base query.

Approach: An application of the LIFER parser. Uses patterns or templates to interpret sentences and associates a function with each pattern. Uses a semantic (pragmatic) grammar and associated functions to implicitly encode knowledge about language and the world. The grammar contains much information about the particular data base being queried.

Capabilities: Can correct spelling. Can handle ellipsis. Can interpret pronouns. Can deal with large and complex data bases; e.g., the Naval Ship DB dealt with 100 fields in 14 files with records of 40,000 ships. Can answer certain questions based upon its own N.L. processing system. Can be taught new syntactic constructions. Can accept a defined input sentence as equivalent to a whole set of questions.

Limitations: Conversation is limited strictly to questions about a small domain. Can't deal with logically complex notions: disjunction, quantification, implication, causality, possibility. Closed-world viewpoint: acts as if it were dealing with a world containing a fixed number of objects and relationships between them, with the objects and relationships being immutable.

Type A System.

TABLE IIh. Natural Language Understanding Systems.

System/Use: SAM, Yale (Schank et al., 1975). Understands stories about prototypical events using script descriptions of them.

Approach: Knowledge of prototypical events is encoded in frames called scripts. Utilizes a domain dictionary. The first word sense that satisfies the local context (as provided by the script) is selected; thus scripts are a convenient means for interpreting words with multiple senses. Understands stories by fitting them to a script in a three part process: 1. A parser generates a conceptual dependency (CD) representation for each sentence. 2. A script applier (APPLY) gives it a set of verb-senses to use once a script is identified. It then checks to see if the CD sentence representation matches the current script or any other script in the data base. If this matching is successful, APPLY makes a set of predictions about likely inputs to follow. Any steps in the current script that were left out in the story are filled in. 3. A memory module takes the resultant references to people, places, things, etc. and fills in information about them.

Capabilities: Can produce a summary of the story (in several different natural languages) and answer questions about it. Can produce paraphrases of the story and make intelligent inferences from it. Can infer missing events using the script.

Limitations: Knowledge is primarily about prototypical event sequences rather than the motivations behind them. Only a single object can serve the role of a player or a prop. Scripts follow a linear sequence of events and can't deal with alternative possibilities. Difficult to determine which scripts are appropriate for a given story.

Type System.
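The script-applying step of SAM's three part process can be sketched in miniature: match the story's events against a script, and fill in the steps the script says must have happened but the story left unstated. The restaurant script below is an invented stand-in for SAM's actual Conceptual Dependency forms.

```python
# A script applier in miniature, after the SAM process described above:
# fit the story's events to a script and fill in unstated steps
# (the "restaurant script" events are invented stand-ins for CD forms).
RESTAURANT_SCRIPT = ["enter", "order", "eat", "pay", "leave"]

def apply_script(story_events, script=RESTAURANT_SCRIPT):
    """Return the full event sequence, inferring the script steps that
    must have occurred between the first and last mentioned events."""
    if not all(e in script for e in story_events):
        return None                  # the story does not fit this script
    first = script.index(story_events[0])
    last = script.index(story_events[-1])
    return script[first:last + 1]

# Story: "John went to a restaurant. He paid and left." -> infer that
# ordering and eating happened, although the story never says so.
filled = apply_script(["enter", "pay", "leave"])
```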

TABLE IIi. Natural Language Understanding Systems.

System/Use: PAM, Yale (Wilensky, 1978). Story understanding.

Approach: Understands stories by determining the goals that are to be achieved in the story. PAM then attempts to match the actions of the story with methods that it knows will achieve the goals. Has a knowledge base of plans and themes. A plan is a set of actions and subgoals for accomplishing the main goal. Themes are basic situations encountered in life, such as "love." The program starts by converting written text into CD representation (as in SAM). The goals of an actor are determined in the following ways: noting them explicitly in the story; using plans; establishing them as subgoals to a known goal; inferring them from a theme noted in the story.

Capabilities: Can summarize a story. Can answer questions about the goals and actions of the characters. Extends SAM beyond stereotyped situations.

Limitations: A great deal of inference can be required by PAM to establish the goals and subgoals of the story from the input text. Much must be known about the nature of the story to be sure that the needed stored plans and themes are available.

Type B-C System.

TABLE III. Current Research NLP Systems.

System: UFID (User Friendly Interface to Data). Purpose: NLI to DBMS. Developer: System Development Corp., Santa Monica, California. Comments: Application independent. Uses an intermediate language as the output of the NL analysis system, then translates from this to the target DBMS query language.

System: ASK (A Simple Knowledgeable System). Purpose: NLI for users creating their own data base. Developer: CA Inst. of Technology, Pasadena, California. Comments: Uses a limited dialect of English. Develops a semantic net with nodes limited to classes, objects, attributes and relations, and the appropriate corresponding arcs.

System: NLP/DBAP. Purpose: NLI to a DB. Developer: Bell Labs, Murray Hill, New Jersey. Comments: Consists of two parts, a Natural Language Processor (NLP) and a Data Base Application Program (DBAP).

The NLP is a general purpose language processor which builds a formal representation of the input. The DBAP is an algorithm which builds a query in an augmented relational algebra from the output of the NLP.

The system is portable and said to be very robust.

TABLE III. Current Research NLP Systems. (continued)

System: IR-NLI. Purpose: NLI for an on-line information retrieval system. Developer: U. of Udine, Udine, Italy. Comments: Utilizes a base of expert knowledge which concerns the evaluation of the user's requests, the management of the research interview, the selection of search strategy, and the scheduling of the lower level modules: UNDERSTANDING and DIALOGUE, REASONING, and FORMALIZER.

The UNDERSTANDING and DIALOGUE Module translates the user's requests into a basic formal internal representation.

System: TEAM (Transportable English Access Data Manager). Purpose: Transportable NLI. Developer: SRI International, Menlo Park, California. Comments: Has three major components: an acquisition component, the DIALOGIC language system, and a data-access component.

Utilizes the acquisition component to obtain (via an interactive dialogue with the DB management personnel) the information required to adapt the system to a particular DB.

Translates an English query into a DB query in two steps: the DIALOGIC system constructs a logical representation of the query, and the data-access component translates the logical form into a formal DB query.

TABLE III. Current Research NLP Systems. (continued)

System: NOMAD. Purpose: Text understanding. Developer: AI Project, U. of California, Irvine, California. Comments: Uses internal syntactic and semantic expectations to understand unedited naval ship-to-shore messages. Utilizes a large data base of domain specific knowledge.

Outputs a corrected well-formed English translation of the message.

Utilizes knowledge of syntax, semantics, and pragmatics at all stages of the understanding process to cope with ill-formed input.

System: [name illegible]. Purpose: Text understanding. Developer: U. of Strathclyde, Glasgow, Scotland. Comments: Instantiates domain dependent hierarchical frame-like structures (written in PROLOG) by identifying key words and using a domain dictionary.

System: BEDE. Purpose: Machine translation. Developer: U. of Manchester, England. Comments: Analyzes the source text and translates it into an intermediate (interlingua) language, then synthesizes target language text from this.

Allows only a controlled vocabulary and a restricted syntax, with the aim of microprocessor-based MT.

System: (English-Japanese MT). Purpose: Machine translation. Developer: Kyoto U., Japan. Comments: Uses Montague Grammar to generate an intermediate representation of meaningful semantic relations in a functional logical form. Converts the logical form to a conceptual phrase structure form associated with Japanese.

TABLE III. Current Research NLP Systems. (continued)

System Purpose Developer Comments

System: LRC MT. Purpose: Machine translation. Developer: U. of Texas, for Siemens, Munich, W. Germany. Comments: Employs a phrase-structure (PS) grammar augmented by lexical controls.

Utilizes over 400 PS rules describing the source language (German) and nearly 10,000 lexical entries in each of two languages (German and the target language, English).

Uses an all-paths, bottom-up parser.

Uses special procedures to cope with ungrammatical input.

System: (Not Named NLP System). Purpose: NLI to an inferencing KB. Developer: Hewlett Packard, Palo Alto, California. Comments: The system's main components are: a Generalized Phrase Structure Grammar; a top-down parser; a logic transducer that outputs a first-order logical representation; and a "disambiguator" that uses sortal information to convert logical expressions into the query language for HIRE (a relational data base).

System: KLAUS (Knowledge-Learning and -Using System). Purpose: Computer acquisition of a model of a domain of interest by being instructed in English. Developer: SRI International, Menlo Park, California. Comments: Uses SRI's DIALOGIC NLP system to translate English sentences into logical representations of their literal meaning in the context of the utterance.

KLAUS is a DARPA-sponsored long-term research project to develop techniques for facilitating the acquisition of knowledge by computer.

TABLE III. Current Research NLP Systems. (continued)

System: TEXT. Purpose: Text generation. Developer: U. of Pennsylvania, Phila., Pennsylvania. Comments: Schemas which encode aspects of discourse structure are used to guide the discourse process.

A focusing mechanism monitors the use of the schemas, providing constraints on what can be said at any point.

On the basis of the input question, semantic processes produce a relevant knowledge pool. A partially ordered set of rhetorical techniques is selected as appropriate for the pool. A message is generated by matching propositions in the pool to the associated rhetorical techniques.

System: EPISTLE. Purpose: Text understanding and text generation. Developer: IBM C.S. Dept., Yorktown Hts., New York. Comments: Utilizes an augmented phrase structure grammar. The core grammar consists at present of a set of 300 syntax rules.

Ambiguity is resolved by using a metric that ranks alternative parses.

A "fitted-parse" technique is used to produce reasonable approximate parses of ungrammatical inputs.

Uses an on-line dictionary with about 130,000 entries.

TABLE III. Current Research NLP Systems. (concluded)

System Purpose Devehiper Comments

System: TOPIC. Purpose: Automatic text condensation. Developer: U. of Constance, Infor. Sci. Dept., West Germany. Comments: Uses frame-oriented knowledge representation models. Utilizes an "interacting word experts" approach to aid in textual parsing.

System: KAMP. Purpose: NL generation. Developer: SRI International, Menlo Park, California. Comments: Plans NL utterances, starting with a high-level description of the speaker's goals.

The heuristic plan generation process is performed by a NOAH-like hierarchical planner and verified by a first order logic theorem prover.

The planner uses knowledge about the different subgoals to be achieved and linguistic rules about English to produce utterances that satisfy multiple goals.


TABLE IV. Some Commercial Natural Language Systems.

System: INTELLECT (also distributed as ON-LINE ENGLISH (Cullinane) and GRS Executive (Information Sciences)). Organization: Artificial Intelligence Corp., Waltham, Massachusetts. Purpose: NLI for data base retrieval; $50K/system. Comments: Derivative of ROBOT; several hundred systems sold. Takes about 2 weeks to implement for a new data base (other extensions underway). Written in PL/I. Available for mainframes.

System: EARL. Organization: Cognitive Systems, New Haven, Connecticut. Purpose: Custom NLI's. Comments: Based on SAM. Large start-up cost in building the knowledge base. Written in LISP. The first system, Explorer, is an interface system; others are interfaces to data bases.

System: STRAIGHT TALK. Organization: Symantec, Sunnyvale, California. Purpose: Portable NLI for DBMS for microcomputers; $660/system. Comments: Written in PASCAL. Designed to be very compact and efficient. Available about Nov. 1983.

System: SAVVY. Organization: Savvy Marketing International, Sunnyvale, California. Purpose: System interface for microcomputers. Comments: Not linguistic; does pattern matching on strings of characters. User adaptive. Released 3/82.

User customized.

TABLE IV. Some Commercial Natural Language Systems. (continued)

System Organisation Purpose Comments

System: Weidner System. Organization: Weidner Communications Corp., Provo, Utah. Purpose: Semi-automatic natural language translation; $16K/language direction. Comments: Linguistic approach. Written in FORTRAN IV. Translation with human editing is approximately 1000 words/hr (up to eight times as fast as a human alone).

Approx. 20 sold by the end of 1982, mainly to large multi-national corporations.

System: ALPS. Organization: ALPS, Provo, Utah. Purpose: Interactive natural language translation. Comments: Linguistic approach. Uses a dictionary that provides the various translations for technical words as a display to the human translator, who then selects among the displayed words.

System: NLMENU. Organization: Texas Instruments, Inc., Dallas, Texas. Purpose: NLI to relational data bases. Comments: Menu driven NL query system. All queries constructed from menus fall within the linguistic and conceptual coverage of the system; therefore, all queries entered are successful.

Grammars used are semantic grammars written in a context-free grammar formalism.

Producing an interface to any arbitrary set of relations is automated and only requires a 15-30 minute interaction with someone knowledgeable about the relations in question.

System will be available late in 1983 as a software package for a microcomputer. K. State of the Art it is now feasible to use computers to deal withnatural language input in highly texts. HONvereri interacting with restricted con- people in a facilemanner is still far off, requiring understanding Of A here people are coming fromtheir know ledge,goals and moods; In today's computingenvironment, the only Type A systemsthose systems that perform robustly and efficientlyare that dO not use explicit worldModels, but depend mart:lune and/or serriantic on key word or pattern grammars. In actual workingsystems, both utiderStanding and generation; AM-like grammarS text can be considered the state of theart; L. Problems and Issues How People USe Language Nlan of the issues in natural language understandingcenter around the way people use language. Given speechacts can serve many purposes, depending on the goals, intentionsand Strategies of the speaker-. Thus,methodS for determining the underlying Motivation ofa speech act is a major issue. Another issueis understanding how humans process language=bOth inform- ing output and in interpretinginput. It also appears that knoWledge-basedinference is essential to natural as language just provides language understanding; abbreviated cues thatmust be fleshed out using iti-cidelS and resider.in the receiver. Finally/ expectations we do not even have a good handleon what it means to under- stand la, 6uage and what is the relation between languigeand perception. 2. Linguis.'ics A major issue in NLP is how to resolve ambiguitiesin word meanings to determine propriate sense in the current their ap- context. A complementary problemis dealing with novel language such as metaphors; idioms,simileS and analogies:, Syntactic ambiguity isa common source Of trouble in natural language processing. Whereto attach modifying clauses isone problem. However even handling difficult. 
Another major issue is pragmatics, the study of language in context. Arden (1980, p. 474) notes:

Many of the issues discussed under frame systems are pertinent to pragmatic issues. The prototypes stored in a frame system can include both the prototypes for the domain being discussed and those related to the conversational situation. In a travel-planning system, then, a user responds to the question, "What time do you want to leave?" with the answer: "I have to be at a meeting by 11." In planning an appropriate flight, the system makes assumptions about the relevance of the answer to the question. This aspect of language is one that is just beginning to be dealt with in current systems. Although most large systems in the past had specialized ways of dealing with a subset of pragmatic problems, there is as yet no theoretical approach. As people look to interactive systems for teaching and explanation, however, it seems likely that this will be the major focus of research in the 1980's.

3. Conversation

In the area of everyday conversation, the real world is extensive, complex, largely unknown and unknowable. This is quite different from the closed world of many of the research NLP systems.
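The travel-planning exchange in Arden's example above can be sketched with a minimal frame whose slots the system fills by assuming the reply is relevant to the trip. This is only an illustration; the slot names and the tiny pattern rules are invented, not taken from any actual system of the period:

```python
# A minimal frame sketch of the travel-planning exchange: the user answers
# "What time do you want to leave?" with an arrival constraint, and the
# system fills a different slot than the one literally asked about.
TRIP_FRAME = {
    "depart_at": None,   # what the question literally asked for
    "arrive_by": None,   # constraint the system can infer instead
}

def interpret_answer(frame, answer):
    """Fill frame slots from the user's reply, assuming it is relevant."""
    words = answer.lower().replace(".", "").split()
    if "by" in words:                            # "...be at a meeting by 11"
        frame["arrive_by"] = words[words.index("by") + 1]
    elif words and words[-1].isdigit():          # a literal departure time
        frame["depart_at"] = words[-1]
    return frame

frame = interpret_answer(dict(TRIP_FRAME), "I have to be at a meeting by 11")
print(frame)  # {'depart_at': None, 'arrive_by': '11'}
```

The point of the sketch is the pragmatic move: relevance to the conversational situation, not the literal question, decides which slot the answer fills.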

"A major problem for NLP systems is following the dialogue context and being able to ascertain the references of noun phrases by taking context into account." (Hendrix and Sacerdoti, 1981, p. 330)

Another major problem is understanding the motivation of the participants in the discourse in order to penetrate their remarks. As conversational natural-language communication between individuals is dependent on what the participants know about each other's knowledge, beliefs, plans, and goals, methods for developing and incorporating this knowledge into a computer are major issues.
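A crude version of the reference-finding problem can be sketched with a recency-based "history list": each pronoun is linked to the most recently mentioned noun phrase whose gender and number agree. The lexicon and example sentence are invented, and real dialogue systems of the period used far richer context than recency alone:

```python
# Naive recency-based pronoun resolution: link each pronoun to the most
# recently mentioned entity with matching gender/number features.
LEXICON = {
    "john": {"gender": "m", "number": "sg"},
    "mary": {"gender": "f", "number": "sg"},
    "he":   {"gender": "m", "number": "sg"},
    "she":  {"gender": "f", "number": "sg"},
}
PRONOUNS = {"he", "she"}

def resolve(tokens):
    history, links = [], {}              # indices of candidate referents
    for i, tok in enumerate(tokens):
        t = tok.lower()
        if t in PRONOUNS:
            for j in reversed(history):  # search most recent first
                if LEXICON[tokens[j].lower()] == LEXICON[t]:
                    links[i] = j
                    break
        elif t in LEXICON:
            history.append(i)
    return links

print(resolve("John met Mary before she left".split()))  # {4: 2}
```

The sketch fails exactly where the quoted passage says real systems struggle: when the correct referent depends on the participants' goals and beliefs rather than on surface features and recency.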

4. Processor Design

"While many specific problems are linguistic, ... many important problems are actually general AI problems of representation and process organization." (Arden, 1980, p. 409)

A major issue in the design of an NLP system is choosing the tradeoffs between capability, efficiency and simplicity. Also at issue are the language constructs to be handled, generality, processing time and costs. The choice of the overall architecture of the system and the grammar to be used is a major design decision for which there are as yet no general criteria.

Though all natural language processing systems contain some sort of parser, the practical application of grammar to NLP has proved difficult. The design of the theory and its implementation is a complex problem. Also at issue is the top-down (ATN-like) approach to parsing versus bottom-up and combined approaches. In addition, how best to utilize knowledge sources (phonemic, lexical, syntactic, semantic, etc.) in designing a parser and a system architecture remains a major issue.

A problem with the ATN parser approach, with its heavy dependence on syntax, is how it can be adapted to handle ungrammatical inputs. Though considerable progress has been made, there is as yet no clear solution. INTELLECT (a commercial ATN-based system) handles ungrammatical constructions by relaxing syntactic constraints. IBM's Epistle system (Jensen and Heidorn, 1983) uses a fitting procedure on ungrammatical inputs to produce a reasonable approximate parse. Semantic grammars and expectation-driven systems have an advantage in overcoming ungrammatical inputs.

Another major issue is: Is it appropriate to keep the semantic analysis separate from the syntactic analysis, or should the two work interactively? (see Charniak, 1981)

Is it necessary in NL translating or understanding to utilize an intermediate representation, or can the final interpretation be gotten at more directly?
If an intermediate representation (such as found in case systems or conceptual dependency) is to be used, which one is best? What is the appropriate role of primitive concepts in natural language processing?

A major problem is how we can make restricted natural language more palatable to humans. A related difficulty is the negative expectations created in the mind of a naive user when a system doesn't understand an input sentence. Naive users have difficulty distinguishing between the limitations in a system's conceptual coverage and the system's linguistic coverage. A related problem is the system returning a null answer; this may mislead the user, as an answer may be null for many reasons. Another problem is ensuring a sufficiently rapid response to user inputs.
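The constraint-relaxation strategy mentioned above for ungrammatical input can be illustrated with a deliberately tiny sketch. It is not the actual INTELLECT or Epistle algorithm: a strict pass demands subject-verb agreement, and when that fails a second pass drops the check to produce an approximate parse, flagged as relaxed:

```python
# Illustrative constraint relaxation (invented grammar and lexicon): first
# demand subject-verb agreement, then retry without it to obtain an
# approximate parse of an ungrammatical input.
LEXICON = {
    "dog":   ("N", "sg"), "dogs": ("N", "pl"),
    "barks": ("V", "sg"), "bark": ("V", "pl"),
    "the":   ("Det", None),
}

def parse(tokens, strict=True):
    """Accept 'Det N V' sentences; agreement is checked only when strict."""
    if len(tokens) != 3:
        return None
    (c1, _), (c2, n2), (c3, n3) = (LEXICON[t] for t in tokens)
    if (c1, c2, c3) != ("Det", "N", "V"):
        return None
    if strict and n2 != n3:
        return None              # subject-verb agreement violated
    return {"subj": tokens[1], "verb": tokens[2], "relaxed": not strict}

def fitted_parse(tokens):
    """Strict parse if possible, otherwise an approximate (relaxed) one."""
    return parse(tokens, strict=True) or parse(tokens, strict=False)

print(fitted_parse("the dogs barks".split()))
# {'subj': 'dogs', 'verb': 'barks', 'relaxed': True}
```

Flagging the relaxed parse matters: the system can still answer the user while recording that the input fell outside its strict linguistic coverage.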


One common problem with real systems is stonewalling behavior: the system not responding to what the user is really after (the user's goal) because the user hasn't suitably worded the input.

Some of the important problems and issues have to do with knowledge representation:

Which knowledge representation is appropriate for a given problem?
How to represent such things as space, time, events, human behavior, emotions, physical mechanisms and the many processes associated with novel language?
How can common sense and plausibility judgment (is that meaning possible?) be represented?
How should items in memory be indexed and accessed?
How should context be represented?
How should memory be updated? How to deal with inconsistencies?
How can we make the representations more precise?
How can we make the system learn from experience so as to build up the large body of knowledge needed to deal with the real world?
How can we build useful internal representations that correspond to 3D models, from information provided by natural language?

NLP usually takes the sentence as the basic unit to be analyzed. Assigning purpose and meaning to larger units has proved difficult. The NRL Conceptual Linguistics Workshop (1981) concluded that "Concept extraction was the most difficult task examined at the workshop. Success depends on the adequacy of the situation-context representation and the development of more sophisticated models of language use."

NLP has always pushed the limits of computer capability. Thus a current problem is designing special computer architectures and processors for NLP.
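As one concrete instance of the representation choices these questions raise, a conceptual-dependency-style encoding (in the spirit of Schank's primitives; the two-entry verb table and role names here are an invented fragment) reduces a sentence to a primitive act plus role fillers:

```python
# Toy conceptual-dependency-style intermediate representation: verbs map to
# a small set of primitive acts, and a sentence becomes a primitive plus
# role fillers. The verb table is an invented fragment for illustration.
PRIMITIVES = {
    "gave": "ATRANS",   # transfer of possession
    "went": "PTRANS",   # transfer of physical location
}

def to_cd(subject, verb, obj, recipient=None):
    act = PRIMITIVES[verb]
    return {
        "act": act,
        "actor": subject,
        "object": obj,
        "to": recipient,
        # for a possession transfer, the donor is the actor
        "from": subject if act == "ATRANS" else None,
    }

cd = to_cd("John", "gave", "book", recipient="Mary")
print(cd["act"], cd["to"])  # ATRANS Mary
```

Because "gave," "sold," and "handed" would all map to ATRANS, inference rules can be written once against the primitive rather than once per verb, which is the usual argument for such representations.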

5. Data Base Interfaces

Hendrix and Sacerdoti (1981, pp. 318, 350) point out two problems particularly associated with data base interfaces:

(1) The need to understand context throws considerable doubt on the idea of building natural-language interfaces to systems with knowledge bases independent of the language processing system itself.

(2) One of the practical problems currently limiting the use of NLP systems for accessing data bases is the lack of trained people and good support tools for creating the knowledge structures needed for each new data base.

6. Text Understanding

Text understanding systems have encountered problems in achieving practicality, both in terms of extending the knowledge of the language and in providing a sufficiently broad base of world knowledge. The NRL Conceptual Linguistics Workshop (1981) concluded that "Current systems for extracting information from military messages use the key word and key phrase methods which are incapable of providing adequate semantic representation. In the immediate future, more general methods for concept extraction probably will work well only in well defined subfields that are carefully selected and painstakingly modeled."

SRI and the National Library of Medicine have text understanding systems in the research stage. SRI handcodes logic formulas that describe the content of a paragraph. Queries are matched against these paragraph descriptions.

M. Research Required

Current research in natural language processing systems includes machine translation, information retrieval and interactive interfaces to computer systems. Important supporting research topics are language and text analysis, user modeling, domain modeling, task modeling, discourse modeling, reasoning and knowledge representation.

Much of the research required (as well as the research now underway) is centered on addressing the problems and issues discussed in the following areas:

1. How People Use Language

The psychological mechanisms underlying human language production are a fertile field for investigation. Efforts are needed to build explicit computational models to help explain why human languages are the way they are and the role they play in human perception.

2. Linguistics

Further research is needed on methods for resolving ambiguities in language and for the utilization of context in language understanding.

3. Conversation

Additional work is needed on ways to represent the huge amount of knowledge needed for Natural Language Understanding (NLU).

A great deal of research is needed to give NLU systems the ability to understand not only what is actually said, but the underlying intention as well.

Research is now underway by many groups on explicitly modeling the goals, intentions and planning abilities of people. Investigation of script- and frame-based systems is currently the most active NLP AI research area.

4. Processor Design

Architectures, grammars, parsing techniques and internal representations needed for NLP systems remain important research areas.

One particularly fertile area is how best to utilize semantics to guide the path of the syntactic parser. Charniak (1981, p. 1085) indicates that a relatively unexplored area requiring research is the interaction between the processes of language comprehension and the form of semantic representation used.

Further work is needed on bringing multiple knowledge sources (KS's: syntactic, semantic, pragmatic and contextual) to bear on understanding a natural language utterance, while still keeping the KS's separate for easy updating and modification. Also needed is further work on problem-solving to cope with the problem of finding an appropriate structure in the huge space of possible meanings of a natural language input.

Improved NLU techniques are needed to handle complex notions such as disjunction, quantification, implication, causality and possibility. Also needed are better methods for handling "open worlds," where not all things needed to understand the world are in the system's knowledge base.

Further research is also necessary to aid with a common source of trouble in NLP, that is, dealing with syntactic and semantic ambiguities and how to handle metaphors and idioms.

Finally, the problems of efficiency, speed and portability, discussed in the previous chapter, are all in need of better solutions.

5. Data Base Interfaces

A current research topic is how data base schemas can best be enriched to support a natural language interface, and what the best logical structure would be for a particular data base. Research is also needed on more efficient methods for compiling a vocabulary for a particular application.
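The vocabulary-compilation problem can be sketched by deriving an interface's question patterns mechanically from a data base schema, so that each new data base seeds its own lexicon. The schema, table and pattern shape below are invented for illustration:

```python
# Sketch of compiling an interface vocabulary from a data base schema:
# field names become the terms the interface understands, and one question
# pattern is generated per non-key field. All names here are invented.
import re

SCHEMA = {"employee": ["name", "salary", "dept"]}
TABLE = {"smith": {"salary": 32000, "dept": "engineering"}}

def build_pattern(schema):
    """Derive a single question pattern from the schema's field names."""
    fields = [f for f in schema["employee"] if f != "name"]
    return re.compile(r"what is the (%s) of (\w+)\??" % "|".join(fields))

def answer(question, pattern):
    m = pattern.fullmatch(question.lower())
    # a null answer can mislead the user, so real systems should explain it
    return TABLE[m.group(2)][m.group(1)] if m else None

pattern = build_pattern(SCHEMA)
print(answer("What is the salary of Smith?", pattern))  # 32000
```

Adding a field to the schema automatically extends the questions the interface accepts, which is the kind of support tooling the passage above calls for.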

6. Text Understanding

Seeking general methods of concept extraction remains one of the major research areas in text understanding.

N. Principal U.S. Participants in NLP

1. Research and Development

Non-Profit

SRI
MITRE

Universities

Yale U., Dept. of Computer Science
U. of CA, Berkeley, Computer Science Div., Dept. of EECS
Carnegie-Mellon U., Dept. of Computer Science
U. of Illinois, Urbana, Coordinated Science Lab
Brown U., Dept. of Computer Science
Stanford U., Computer Science Dept.
U. of Rochester, Computer Science Dept.
U. of Mass., Amherst, Dept. of Computer and Information Science
SUNY, Stony Brook, Dept. of Computer Science
U. of CA, Irvine, Computer Science Dept.

A review of current research in NLP is given in Kaplan (1982).

U. of PA, Dept. of Computer and Infor. Science
GA Institute of Technology, School of Infor. and Computer Science
USC Infor. Sciences Institute
MIT AI Lab
NYU Computer Science Dept. and Linguistic String Project
U. of Texas at Austin, Dept. of Computer Science
Cal. Inst. of Tech.
Brigham Young U., Linguistics Dept.
Duke U., Dept. of Computer Science
N. Carolina State U., Dept. of Computer Science
Oregon State U., Dept. of Computer Science

Industrial

BBN
TRW Defense Systems
IBM, Yorktown Heights, N.Y.
Burroughs
Sperry Univac
Systems Development Corp., Santa Monica
Hewlett-Packard
Martin Marietta, Denver
Texas Instruments, Dallas
Xerox PARC
Bell Labs
Institute for Scientific Information, Phila., PA
GM Research Labs, Warren, MI
Honeywell

2. Principal U.S. Government Agencies Funding NLP Research

ONR (Office of Naval Research)
NSF (National Science Foundation)
DARPA (Defense Advanced Research Projects Agency)

3. Commercial NLP Systems

Artificial Intelligence Corp., Waltham, Mass.
Cognitive Systems Inc., New Haven, Conn.
Symantec, Sunnyvale, CA
Texas Instruments, Dallas, TX
Weidner Communications, Inc., Provo, Utah
SAVVY Marketing Inter., San Mateo, CA
ALPS, Provo, UT

4. Non-U.S.

U. of Manchester, England
Kyoto U., Japan
Siemens Corp., Germany
U. of Strathclyde, Scotland
Centre National de la Recherche Scientifique, Paris
U. di Udine, Italy
U. of Cambridge, England
Philips Res. Labs, The Netherlands

O. Forecast

Commercial natural language interfaces (NLI's) to computer programs and data base management systems are now becoming available. The imminent advent of NLI's for microcomputers is the precursor for eventually making it possible for virtually anyone to have direct access to powerful computational systems. As the cost of computing has continued to fall, but the cost of programming hasn't, it has already become cheaper in some applications to create NLI systems (that utilize subsets of English) than to train people in formal programming languages.

Computational linguists and workers in related fields are devoting considerable attention to the problems of NLP systems that understand the goals and beliefs of the individual communicators. Though progress has been made, and feasibility has been demonstrated, more than a decade will be required before useful systems with these capabilities become available.

One of the problems in implementing new installations of NLP systems is gathering information about the applicable vocabulary and the logical structure of the associated data bases. Work is now underway to develop tools to help automate this task. Such tools should be available within 5 years.

For text understanding, experimental programs have been developed that "skim" stylized text such as short disaster stories in newspapers (DeJong, 1982). Despite the practical problems of sufficient world knowledge and the extension of language knowledge required, practical tools emerging from these efforts should be available to provide assistance to humans doing text understanding within this decade.

The NRL Computational Linguistics Workshop (1981) concluded that text generation techniques are maturing rapidly and new application possibilities will appear within the next five years.
The NRL workshop also indicated that:

Machine aids for human translators appear to have a brighter prospect for immediate application than fully automatic translation; however, the Canadian French-English weather bulletin project is a fully automatic system in which only 20% of the translated sentences require minor rewording before public release. An ambitious Common Market project involving machine translation among six European languages is scheduled to begin shortly. Sixty people will be involved in that undertaking, which will be one of the largest projects undertaken in computational linguistics.*

The panel was divided in its forecast on the five-year perspective of machine translation, but the majority were very ...

*EUROTRA: A machine translation project sponsored by the European Common Market; 8 countries, over 15 universities, $24 M over several years.


Nippon Telegraph and Telephone Corp. in Tokyo has a machine translation AI project underway. An experimental system for translating from Japanese to English and vice versa is now being demonstrated. In addition, the recently initiated Japanese Fifth Generation Computer effort has computer-aided natural language understanding as one of its major goals.

In summary, natural language interfaces using a limited subset of English are now becoming available. Hundreds of specialized systems are already in operation. Major efforts in text understanding and machine translation are underway, and useful (though limited) systems will be available within the next five years. Systems that are heavily knowledge-based and handle more complete sets of English should be available within this decade. However, systems that can handle unrestricted natural discourse and understand the motivation of the communicators remain a distant goal, probably requiring more than a decade before useful systems appear.

As natural language interfaces coupled to intelligent computer programs become widespread, major changes in our society are likely to result. There is a trend now to replace relatively unskilled white collar and factory work with trained computer personnel operating computer-based systems. However, with the advent of friendly interfaces (and eventually even speech understanding systems and automatic text generation from speech), relatively unskilled personnel will be able to control complex machines, operations, and computer programs. As this occurs, even relatively skilled factory and white collar work may be taken over by these lesser skilled personnel with their computer aids, with the experts and computer personnel moving on to develop new programs and applications. The outcome of such a revolution cannot be fully predicted at this time, other than to suggest that much of the power of the computer age will become available to everyone, requiring a rethinking of our national goals and life styles.

P. Further Sources of Information

1. Journals

American Journal of Computational Linguistics, published by the major society in NLP, the Association for Computational Linguistics (ACL)
SIGART Newsletter, ACM (Association for Computing Machinery)
Artificial Intelligence
Cognitive Science, Cognitive Science Society
AI Magazine, American Association for AI (AAAI)
Pattern Analysis and Machine Intelligence, IEEE
International Journal of Man-Machine Interactions

2. Conferences

Computational Linguistics (COLING), held biennially. The next one is in July 1984 at Stanford University.
International Joint Conference on AI (IJCAI), biennial. The current one is in Germany, August 1983.
ACL annual conferences.
AAAI annual conferences.
ACM conferences.
IEEE Systems, Man & Cybernetics annual conferences.
Conference on Applied Natural Language Processing, sponsored jointly by ACL & ...; 1983 in Santa Monica, CA.

3. Recent Books

Winograd, T., Language as a Cognitive Process, Vol. 1: Syntax, Reading, Mass: Addison-Wesley, 1983.
Lehnert, W.G. and Ringle, M.H. (eds.), Strategies for Natural Language Processing, Hillsdale, N.J.: Lawrence Erlbaum, 1982.
Sager, N., Natural Language Information Processing, Reading, Mass: Addison-Wesley, 1981.
Tennant, H., Natural Language Processing, New York: Petrocelli, 1981.
Brady, M., Computational Approaches to Discourse, Cambridge, Mass: MIT Press, 1982.
Joshi, A.K., Webber, B.L. and Sag, I.A. (eds.), Elements of Discourse Understanding, Cambridge: Cambridge University Press, 1981.
Bolc, L. (ed.), Natural Language Communication with Computers, Berlin: Springer-Verlag, 1983.
Bolc, L. (ed.), Data Base Question Answering Systems, Berlin: Springer-Verlag, 1982.
Schank, R.C. and Riesbeck, C.K., Inside Computer Understanding, Hillsdale, N.J.: Lawrence Erlbaum, 1981.

4. Overviews and Surveys

Barr, A. and Feigenbaum, E.A., Chapter IV, "Understanding Natural Language," The Handbook of Artificial Intelligence, Vol. 1, Los Altos, CA: W. Kaufmann, 1981, pp. 223-322.
Kaplan, S.J., "Special Section: Natural Language," SIGART Newsletter, No. 79, Jan. 1982, pp. 27-109.
Charniak, E., "Six Topics in Search of a Parser: An Overview of AI Language Research," IJCAI-81, pp. 1079-1087.
Waltz, D.L., "The State of the Art in Natural Language Understanding," in Strategies for Natural Language Processing, W.G. Lehnert and M.H. Ringle (eds.), Hillsdale, N.J.: Lawrence Erlbaum, 1982, pp. 3-32.
Slocum, J., "A Practical Comparison of Parsing Strategies for Machine Translation and Other Natural Language Processing Purposes," Tech. Report NL-41, Dept. of C.S., U. of Texas, Aug. 1981.
Hendrix, G.G. and Sacerdoti, E.D., "Natural-Language Processing: The Field in Perspective," Byte, Sept. 1981, pp. 304-352.

REFERENCES

Arden, B.W. (ed.), What Can Be Automated? (COSERS), Cambridge, Mass: MIT Press, 1980.
Barr, A. and Feigenbaum, E.A., Chapter 4, "Understanding Natural Language," The Handbook of Artificial Intelligence, Los Altos, CA: W. Kaufmann, 1981, pp. 223-321.
Barrow, H.G., "Artificial Intelligence: State of the Art," Technical Note 198, Menlo Park, CA: SRI International, Oct. 1979.
Brown, J.S. and Burton, R.R., "Multiple Representations of Knowledge for Tutorial Reasoning," in Representation and Understanding, D.G. Bobrow and A. Collins (eds.), New York: Academic Press, 1975.
Burton, R.R., "Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems," BBN Report 3453, BBN, Cambridge, Dec. 1976.
Charniak, E., "Six Topics in Search of a Parser: An Overview of AI Language Research," IJCAI-81, pp. 1079-1087.
Charniak, E. and Wilks, Y., Computational Semantics, Amsterdam: North-Holland, 1976.
Chomsky, N., Syntactic Structures, The Hague: Mouton, 1957.
DeJong, G., "An Overview of the FRUMP System," in Strategies for Natural Language Processing, W.G. Lehnert and M.H. Ringle (eds.), Hillsdale, N.J.: Lawrence Erlbaum, 1982, pp. 149-176.
Fillmore, C., "Some Problems for Case Grammar," in R.J. O'Brien (ed.), Report of the Twenty-Second Annual Round Table Meeting on Linguistics and Language Studies, Wash., D.C.: Georgetown U. Press, 1971, pp. 35-56.
Finin, T.W., "The Semantic Interpretation of Compound Nominals," Ph.D. Thesis, U. of IL, Urbana, 1980.
Gawron, J.M. et al., "Processing English with a Generalized Phrase Structure Grammar," Proc. of the 20th Meeting of ACL, U. of Toronto, Canada, 16-18 June 1982.
Gazdar, G., "Unbounded Dependencies and Coordinate Structure," Linguistic Inquiry, 12, 1981, pp. 155-184.
Gevarter, W.B., An Overview of Computer Vision, NBSIR 82-2582, National Bureau of Standards, Wash., D.C., September 1982.
Gevarter, W.B., An Overview of Artificial Intelligence and Robotics, Vol. 1, NBS (in press), 1983.
Graham, N., Artificial Intelligence, Blue Ridge Summit, PA: TAB Books, 1979.
Hendrix, G.G. and Sacerdoti, E.D., "Natural-Language Processing: The Field in Perspective," Byte, Sept. 1981, pp. 304-352.
Hendrix, G.G., Sacerdoti, E.D., Sagalowicz, D., and Slocum, J., "Developing a Natural Language Interface to Complex Data," ACM Transactions on Database Systems, Vol. 3, No. 2, June 1978.

Jensen, K. and Heidorn, G.E., "The Fitted Parse: 100% Parsing Capability in a Syntactic Grammar of English," Conf. on Applied NLP, Santa Monica, CA, Feb. 1983, pp. 93-98.
Kaplan, S.J. (ed.), "Special Section: Natural Language," SIGART Newsletter, No. 79, Jan. 1982, pp. 27-109.
McDonald, D.B., "Understanding Noun Compounds," Ph.D. Thesis, Carnegie-Mellon U., Pittsburgh, 1982.
McDonald, D.B., "Natural Language Production as a Process of Decision Making Under Constraints," Ph.D. Thesis, M.I.T., Cambridge, 1980.
Nishida, T. and Doshita, S., "An Application of Montague Grammar to English-Japanese Machine Translation," Proc. of Conf. on Applied NLP, Santa Monica, Feb. 1983.
Rieger, C. and Small, S., "Word Expert Parsing," Proc. of the Sixth International Joint Conference on Artificial Intelligence, 1979, pp. 723-728.
Robinson, A.E., et al., "Interpreting Natural Language Utterances in Dialogs about Tasks," AI Center TN 210, SRI Inter., Menlo Park, CA, 1980.
Schank, R.C. and Abelson, R.P., Scripts, Plans, Goals and Understanding, Hillsdale, N.J.: Lawrence Erlbaum, 1977.
Schank, R.C. and Riesbeck, C.K., Inside Computer Understanding, Hillsdale, N.J.: Lawrence Erlbaum, 1981.
Schank, R.C. and Yale AI Project, "SAM: A Story Understander," Research Rept. 43, Dept. of Comp. Sci., Yale U., 1975.
Slocum, J., "A Practical Comparison of Parsing Strategies for Machine Translation and Other Natural Language Purposes," Ph.D. Thesis, U. of Texas, Austin, 1981.
Tennant, H., Natural Language Processing, New York: Petrocelli, 1981.
Waltz, D.L., "Natural Language Access to a Large Data Base," in Advance Papers of the International Joint Conference on Artificial Intelligence, Cambridge, Mass: MIT, 1975.
Waltz, D.L., "The State of the Art in Natural Language Understanding," in Strategies for Natural Language Processing, W.G. Lehnert and M.H. Ringle (eds.), Hillsdale, N.J.: Lawrence Erlbaum, 1982, pp. 3-32.
Webber, B.L. and Finin, T.W., "Tutorial on Natural Language Interfaces, Part 1: Basic Theory and Practice," AAAI-82 Conference, Pittsburgh, PA, Aug. 17, 1982.
Wilks, Y., "A Preferential Pattern-Seeking Semantics for Natural Language Processing," Artificial Intelligence, Vol. 6, 1975, pp. 53-74.
Winograd, T., Understanding Natural Language, New York: Academic Press, 1972.
Winograd, T., Language as a Cognitive Process, Vol. I: Syntax, Reading, Mass: Addison-Wesley, 1983.
Woods, W.A., "Progress in Natural Language Understanding: An Application to Lunar Geology," in Proc. of the National Computer Conference, Montvale, N.J.: AFIPS Press, 1973.
Woods, W.A., "Cascaded ATN Grammars," Amer. J. of Computational Linguistics, Vol. 6, No. 1, 1980, pp. 1-12.
"Applied Computational Linguistics in Perspective," NRL Workshop at Stanford University, 26-27 June 1981. (Proceedings in American Journal of Computational Linguistics, Vol. 8, No. 2, April-June 1982, pp. 55-83.)


GLOSSARY

Anaphora: The repetition of a word or phrase at the beginning of successive statements, questions, etc.
CAI: Computer-Aided Instruction
Case: A semantically relevant syntactic relationship.
Case Frame: An ordered set of cases for each verb form.
Case Grammar: A form of Transformational Grammar in which the deep structure is based on cases.
Computational Linguistics: The study of processing language with a computer.
Conceptual Dependency (CD): An approach, related to case frames, in which sentences are translated into basic concepts expressed in a small set of semantic primitives.
DB: Data Base
DBMS: Data Base Management System
Deep Structure: The underlying formal canonical syntactic structure, associated with a sentence, that indicates the sense of the verbs and includes subjects and objects that may be implied but are missing from the original sentence.
Discourse: Conversation, or exchange of ideas.
Domain: Subject area of the communication.
Frame: A data structure for grouping information on a whole situation, complex object, or series of events.
Grammar: A scheme for specifying the sentences allowed in a language, indicating the syntactic rules for combining words into well-formed phrases and clauses.
Heuristic: Rule of thumb or empirical knowledge used to help guide a solution.
KB: Knowledge Base
Lexicon: A vocabulary or list of words relating to a particular subject or activity.
Linguistics: The scientific study of language.
Morphology: The arrangement and interrelationship of morphemes in words.
Morpheme: The smallest meaningful unit of a language, whether a word, base or affix.
Network Representation: A data structure consisting of nodes and labeled connecting arcs.
NL: Natural Language
NLI: Natural Language Interface
NLP: Natural Language Processing
NLU: Natural Language Understanding
Parse Tree: A tree-like data structure of a sentence, resulting from syntactic analysis, that shows the grammatical relationships of the words in the sentence.
Parsing: Processing an input sentence to produce a more useful representation.
Phonemes: The fundamental speech sounds of a language.


Phrase Structure Grammar: Also referred to as Context-Free Grammar. Type 2 of a series of grammars defined by Chomsky. A relatively natural grammar, it has been one of the most useful in natural language processing.
Pragmatics: The study of the use of language in context.
Script: A frame-like data structure for representing stereotyped sequences of events to aid in understanding simple stories.
Semantic Grammar: A grammar for a limited domain that, instead of using conventional syntactic constituents such as noun phrases, uses meaningful components appropriate to the domain.
Semantics: The study of meaning.
Sense: Meaning.
Surface Structure: A parse tree obtained by applying syntactic analysis to a sentence.
Syntax: The study of arranging words in phrases and sentences.
Template: A prototype model or structure that can be used for sentence interpretation.
Tense: A form of a verb that relates it to time.
Transformational Grammar: A phrase structure grammar that incorporates transformational rules to obtain the deep structure from the surface structure.