Array TFS Storage for Unification Grammars

Glenn C. Slayden

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science

University of Washington
2012

Committee: Emily M. Bender, Stephan Oepen

Program Authorized to Offer Degree: Linguistics – Computational Linguistics

Abstract

Constraint-based grammar formalisms such as Head-Driven Phrase Structure Grammar (HPSG) model linguistic entities as sets of attribute-value tuples headed by types drawn from a connected multiple-inheritance hierarchy. These typed feature structures (TFSes) describe directed graphs, allowing for the application of graph-theoretic analyses. In particular, graph unification, the computation of the most general structure that is consistent with a set of argument graphs (if such a structure exists), can be interpreted as expressing the satisfiability and combination functions for the represented linguistic entities, thus providing a principled method for describing syntactic elaboration. In competent natural language grammars, however, the graphs are typically large and numerous, and computational efficiency is a key engineering concern.

This thesis describes a method for the storage of typed feature structures where each TFS comprises a self-contained, contiguous memory allocation with a tabular internal structure. Also detailed is an efficient unification algorithm for this storage mode. The techniques are evaluated in agree, a new managed-execution concurrent unification chart parser which supports both syntactic analysis (parsing) and surface realization (generation) within the framework of the DELPH-IN (Deep Linguistic Processing with HPSG Initiative) joint reference formalism.

Contents

Acknowledgements
1 Introduction
2 Typed feature structures
  2.1 TFS formalism
    2.1.1 Type hierarchy and feature appropriateness
    2.1.2 Functional formalism
    2.1.3 DAG interpretation
    2.1.4 Contrasting the functional and graph approaches
    2.1.5 Well-formedness
  2.2 Design considerations for TFS storage
    2.2.1 Node allocation and discarding
    2.2.2 Structure sharing
    2.2.3 The feature arity problem
  2.3 Array storage TFS formalism
    2.3.1 Notation
    2.3.2 Formal description
    2.3.3 Node governance
    2.3.4 Array TFS graphical form
    2.3.5 Properness of the out-tuple relation
    2.3.6 Feature enumeration
  2.4 Summary
3 Array TFS storage implementation
  3.1 Managed-code
    3.1.1 Value types
    3.1.2 Direct access
  3.2 Engineering
    3.2.1 Root out-tuple
    3.2.2 4-tuple layout
    3.2.3 Mark assignment
    3.2.4 Hash access
  3.3 Example
  3.4 Future work in array storage
  3.5 Summary
4 Unification
  4.1 Well-formed unification
  4.2 Prior work in linguistic unification
    4.2.1 UNION-FIND
    4.2.2 PATR-II: environments and structure sharing
    4.2.3 D-PATR
    4.2.4 Incremental unification
    4.2.5 Lazy approaches
    4.2.6 Chronological dereferencing
    4.2.7 Later work in term unification
    4.2.8 Strategic lazy incremental copy graph unification
    4.2.9 Quasi-destructive unification
    4.2.10 Concurrent and parallel unification
  4.3 n-way unification
    4.3.1 Motivation
    4.3.2 Procedure
    4.3.3 Evaluation
    4.3.4 Classifying unifier performance
    4.3.5 Summary
  4.4 Array TFS unification
    4.4.1 COMP-ARC-LIST
    4.4.2 Scratch field mapping
    4.4.3 Scratch slot initialization and discarding
    4.4.4 Scratch slot implementation
    4.4.5 Array TFS storage slot mappings
    4.4.6 Slot mapping selection
    4.4.7 Disjoint feature coverage
    4.4.8 Fallback fetch
    4.4.9 Summary of the first pass
    4.4.10 Node counting
    4.4.11 Writing pass
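
The abstract characterizes unification as computing the most general structure consistent with its argument graphs, if one exists. As a rough illustration of that idea only, the following minimal sketch unifies two feature structures represented as nested Python dictionaries. It is not the array-based algorithm this thesis describes: the function name `unify`, the dictionary representation, and the use of `None` to signal failure are all assumptions made for this example, and the sketch deliberately omits the type hierarchy and structure sharing (reentrancy) that real TFS unification requires.

```python
from typing import Optional, Union

# Illustrative simplification (an assumption of this sketch): a feature
# structure is either an atomic value (str) or a dict mapping feature
# names to nested feature structures. No types, no reentrancy.
FS = Union[str, dict]


def unify(a: FS, b: FS) -> Optional[FS]:
    """Return the most general structure consistent with both a and b,
    or None if no such structure exists."""
    # The empty structure carries no constraints, so it unifies with anything.
    if a == {}:
        return b
    if b == {}:
        return a
    # An atom is only consistent with an identical atom.
    if isinstance(a, str) or isinstance(b, str):
        return a if a == b else None
    # Both are non-empty dicts: merge feature by feature without
    # mutating either input.
    result = dict(a)
    for feature, b_value in b.items():
        if feature in result:
            merged = unify(result[feature], b_value)
            if merged is None:
                return None  # conflicting values under the same feature
            result[feature] = merged
        else:
            result[feature] = b_value
    return result


# Hypothetical agreement example: compatible constraints combine.
noun = {"AGR": {"NUM": "sg"}}
verb = {"AGR": {"PER": "3"}, "CAT": "v"}
combined = unify(noun, verb)

# Conflicting constraints have no unifier.
clash = unify({"AGR": {"NUM": "sg"}}, {"AGR": {"NUM": "pl"}})
```

Here `combined` holds the constraints of both inputs under a single `AGR` feature, while `clash` is `None` because `sg` and `pl` cannot both be the value of `NUM`. The recursive merge allocates a new dictionary at every level it visits, which hints at why node allocation and storage layout become engineering concerns for large grammars.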
