GATE Template

SHUMI Support to Human Machine Interaction

Project number:
Project title: SHUMI
Deliverable type: Final Report restricted to project participants
Title of deliverable: Detailed study accompanying Final Report
Release number: 1.0
Contractual date of delivery: July 5th, 2004
Actual date of delivery: September 24th, 2004
Nature of the deliverable: Technical Report
Author(s): Maria Teresa Pazienza, Marco Pennacchiotti, Michele Vindigni, Fabio Massimo Zanzotto
Organization: University of Roma Tor Vergata

SHUMI: Final Report

List of contents

1. SCOPE OF THIS DOCUMENT
2. INTRODUCTION
  2.1 NATURAL LANGUAGE ORIENTED INFORMATION SYSTEMS
    2.1.1 Information Retrieval
      2.1.1.1 Basic Concepts
      2.1.1.2 Documents representation
      2.1.1.3 The retrieval process
      2.1.1.4 Evaluation
      2.1.1.5 Approaches to IR
      2.1.1.6 The TREC Conference
      2.1.1.7 Interesting Readings
    2.1.2 Information Extraction
      2.1.2.1 Basic Concepts
      2.1.2.2 IE and IR
      2.1.2.3 Types of IE
      2.1.2.4 A generic IE system
      2.1.2.5 The Scenario
      2.1.2.6 IE applications and integration
      2.1.2.7 Past, present and future
      2.1.2.8 Interesting Readings
    2.1.3 Question Answering
      2.1.3.1 Basic Concepts
      2.1.3.2 A generic Q/A architecture
      2.1.3.3 Q/A, IR and IE
      2.1.3.4 Past, present and future
      2.1.3.5 Interesting Readings
    2.1.4 A global view of Information Systems approaches
3. AUTOMATIC ASSISTANT FOR INFORMATION ACCESS: RTV PROPOSALS
  3.1 INFORMATION RETRIEVAL AND CLUSTERING
  3.2 “ACTIVE” INFORMATION ACCESS SYSTEM
  3.3 INTEGRATING SEMANTIC MODELS: GENERIC LINGUISTIC KNOWLEDGE AND SPECIFIC DOMAIN KNOWLEDGE
  3.4 LEARNING THE “MISSION ONTOLOGY”
    3.4.1 Prototypical SHUMI Mission Ontology
      3.4.1.1 Terminology Extractor
        3.4.1.1.1 Architecture
        3.4.1.1.2 Validation procedure
        3.4.1.1.3 Document Collection structuring
        3.4.1.1.4 Comparative term analysis
      3.4.1.2 Relation Extractor
        3.4.1.2.1 Architecture
        3.4.1.2.2 Validation procedure
        3.4.1.2.3 Comparative Surface Form Analysis
        3.4.1.2.4 Surface Forms and related sentences
4. ENABLING METHODOLOGIES
  4.1 BAG-OF-WORD ABSTRACTION
  4.2 INTERMEDIATE SYNTACTIC-SEMANTIC LANGUAGE INTERPRETATION
    4.2.1 Robust parsing
      4.2.1.1 Introduction
        4.2.1.1.1 Robustness in NL Parsing
        4.2.1.1.2 Robustness in NL Parsing: the attempt of an empirical definition
      4.2.1.2 A modular, possibly pipelined, and lexicalised architecture for robust natural language parsing
        4.2.1.2.1 Modular approaches: robust redundant voting policies vs. computationally attractive cascades
        4.2.1.2.2 Grammars, Lexicons and "self"-adaptable components
      4.2.1.3 Parsing Engineering in the practice: CHAOS, a pool of syntactic processors
        4.2.1.3.1 Decomposition principles in CHAOS
        4.2.1.3.2 An unifying formalism: XDG
        4.2.1.3.3 The module pool
          4.2.1.3.3.1 Grammar-driven components
          4.2.1.3.3.2 Self-adaptable components
    4.2.2 Robust semantic analysis
      4.2.2.1 Towards "Linguistic Interfaces" to Domain Ontologies
      4.2.2.2 Semantic interpretation through "Linguistic Interfaces"
  4.3 LEARNING DOMAIN ONTOLOGIES AND “LINGUISTIC INTERFACES”
    4.3.1 Ontology Extraction from Plain Texts
      4.3.1.1 Mapping lexical knowledge bases and domain concept hierarchies
        4.3.1.1.1 Inspiring principles
        4.3.1.1.2 Mapping domain concepts to lexical senses
      4.3.1.2 Acquiring relational concepts from texts
