
XTAG - A Graphical Workbench for Developing Tree-Adjoining Grammars*

Patrick Paroubek**, Yves Schabes and Aravind K. Joshi
Department of Computer and Information Science
University of Pennsylvania
Philadelphia PA 19104-6389 USA
pap/schabes/[email protected]

Abstract

We describe a workbench (XTAG) for the development of tree-adjoining grammars and their parsers, and discuss some issues that arise in the design of the graphical interface.

Contrary to string rewriting grammars generating trees, the elementary objects manipulated by a tree-adjoining grammar are extended trees (i.e. trees of depth one or more) which capture syntactic information of lexical items. The unique characteristics of tree-adjoining grammars, their elementary objects found in the lexicon (extended trees) and the derivational history of derived trees (also a tree), require a specially crafted interface in which the perspective has shifted from a string-based to a tree-based system. XTAG provides such a graphical interface in which the elementary objects are trees (or tree sets) and not symbols (or strings).

The kernel of XTAG is a predictive left to right parser for unification-based tree-adjoining grammar [Schabes, 1991]. XTAG includes a graphical editor for trees, a graphical tree printer, utilities for manipulating and displaying feature structures for unification-based tree-adjoining grammar, facilities for keeping track of the derivational history of TAG trees combined with adjoining and substitution, a parser for unification-based tree-adjoining grammars, utilities for defining grammars and lexicons for tree-adjoining grammars, a morphological recognizer for English (75 000 stems deriving 280 000 inflected forms) and a tree-adjoining grammar for English that covers a large range of linguistic phenomena.

Considerations of portability, efficiency, homogeneity and ease of maintenance led us to the use of Common Lisp without its object language addition and to the use of the X Window interface to Common Lisp (CLX) for the implementation of XTAG.

XTAG without the large morphological and syntactic lexicons is public domain software. The large morphological and syntactic lexicons can be obtained through an agreement with ACL's Data Collection Initiative.

XTAG runs under Common Lisp and X Window (CLX).

* This work was partially supported by NSF grants DCR-84-10413, ARO Grant DAAL03-87-0031, and DARPA Grant N0014-85-K0018.

** Visiting from the Laboratoire Informatique Théorique et Programmation, Institut Blaise Pascal, 4 place Jussieu, 75252 PARIS Cedex 05, France.

1 Introduction

Tree-adjoining grammar (TAG) [Joshi et al., 1975; Joshi, 1985; Joshi, 1987] and its lexicalized variant [Schabes et al., 1988; Schabes, 1990; Joshi and Schabes, 1991] are tree-rewriting systems in which the syntactic properties of words are encoded as tree-structured objects of extended size. TAG trees can be combined with adjoining and substitution to form new derived trees.¹

Tree-adjoining grammar differs from more traditional tree-generating systems such as context-free grammar in two ways:

1. The objects combined in a tree-adjoining grammar (by adjoining and substitution) are trees and not strings. In this approach, the lexicon associates with a word the entire structure it selects (as shown in Figure 1) and not just a (non-terminal) symbol as in context-free grammars.

2. Unlike string-based systems such as context-free grammars, two objects are built when trees are combined: the resulting tree (the derived tree) and its derivational history (the derivation tree).²

These two unique characteristics of tree-adjoining grammars, the elementary objects found in the lexicon (extended trees) and the distinction between derived tree and its derivational history (also a tree), require a specially crafted interface in which the perspective must be shifted from a string-based to a tree-based system.

¹ We assume familiarity throughout the paper with the definition of TAGs. See Joshi [1987] for an introduction to tree-adjoining grammar. We refer the reader to Joshi [1985], Joshi [1987], Kroch and Joshi [1985], Abeillé et al. [1990a], Abeillé [1988] and to Joshi and Schabes [1991] for more information on the linguistic characteristics of TAG such as its lexicalization and factoring recursion out of dependencies.

² The TAG derivation tree is the basis for semantic interpretation [Shieber and Schabes, 1990b], generation [Shieber and Schabes, 1991] and machine translation [Abeillé et al., 1990b] since the information given in this data structure is richer than the one found in the derived tree. Furthermore, it is at the level of the derivation tree that ambiguity must be defined.
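The two combining operations named in the introduction can be illustrated with a small Python sketch. It is not XTAG code (XTAG is written in Common Lisp): the `Node` class, the function names and the flat `history` list are assumptions made for illustration, and the history is a simplification of a real derivation tree, which also records the address at which each operation applies. Feature structures are left out.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    subst: bool = False  # substitution site
    foot: bool = False   # foot node of an auxiliary tree


def find(node, pred):
    """Return the first node in preorder that satisfies pred, or None."""
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None


def substitute(tree, initial):
    """Plug `initial` into the first substitution site with a matching label."""
    tree, initial = deepcopy(tree), deepcopy(initial)
    site = find(tree, lambda n: n.subst and n.label == initial.label)
    if site is None:
        raise ValueError("no matching substitution site")
    site.children, site.subst = initial.children, False
    return tree


def adjoin(tree, auxiliary):
    """Insert `auxiliary` at the first interior node labelled like its root."""
    tree, auxiliary = deepcopy(tree), deepcopy(auxiliary)
    site = find(tree, lambda n: n.label == auxiliary.label
                and not n.subst and not n.foot)
    foot = find(auxiliary, lambda n: n.foot)
    if site is None or foot is None:
        raise ValueError("adjunction is not possible")
    # The excised subtree hangs below the foot node; the auxiliary
    # tree takes the place of the adjunction site.
    foot.children, foot.foot = site.children, False
    site.children = auxiliary.children
    return tree


def yield_of(node):
    """Return the leaf labels of a tree, left to right."""
    if not node.children:
        return [node.label]
    return [w for child in node.children for w in yield_of(child)]


# Elementary trees loosely modelled on Figure 1 (labels simplified).
boy = Node("NP", [Node("N", [Node("boy")])])
saw = Node("S", [Node("NP", subst=True),
                 Node("VP", [Node("V", [Node("saw")]),
                             Node("NP", subst=True)])])
thinks = Node("S", [Node("NP", subst=True),
                    Node("VP", [Node("V", [Node("thinks")]),
                                Node("S", foot=True)])])

history = []  # flat stand-in for the derivation tree
derived = saw
for operation, name, elementary in [(substitute, "boy", boy),
                                    (substitute, "boy", boy),
                                    (adjoin, "thinks", thinks),
                                    (substitute, "boy", boy)]:
    derived = operation(derived, elementary)
    history.append((operation.__name__, name))
```

After the loop, `derived` is the derived tree for "boy thinks boy saw boy" and `history` lists the four operations that produced it, which is the information a derivation tree carries in richer form.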


Figure 1: Elementary trees found in a tree-adjoining grammar lexicon

XTAG provides such a graphical interface in which the elementary objects are trees (or tree sets) and not symbols (or strings of symbols).

Skeletons of such workbenches have been previously realized on Symbolics machines [Schabes, 1989; Schifferer, 1988]. Although they provided some insights on the architectural design of a TAG workbench, they were never expanded to a full-fledged natural language environment because of inherent limitations (such as their lack of portability).

XTAG runs under Common Lisp [Steele, 1990] and it uses the Common LISP X Interface (CLX) to access the graphical primitives defined by the X11 protocol. XTAG is portable across machines and Common Lisp compilers.

The kernel of XTAG is a predictive left to right parser for unification-based tree-adjoining grammar [Schabes, 1991]. The system includes the following components and features:

• Graphical edition of trees. The graphical display of a tree is the only representation of a tree accessible to the user. Some of the operations that can be performed graphically on trees are:
  - Add and edit nodes.
  - Copy, paste, move or delete subtrees.
  - Combine two trees with adjunction or substitution. These operations keep track of the derivational history and update attributes stated in form of feature structures as defined in the framework of unification-based tree-adjoining grammar [Vijay-Shanker and Joshi, 1988].
  - View the derivational history of a derived tree and its components (elementary trees).

• A tree display module for efficient and aesthetic formatting of a tree based on a new tree display algorithm [Chalnick, 1989]. The algorithm is an improvement of the ones developed by Reingold and Tolford [1981] and Lee [1987]. It guarantees in linear time that trees which are structural mirror images of one another are drawn such that their displays are reflections of one another while achieving minimum width of the tree.

• Capabilities for grouping trees into sets which can be linked to a file. This is particularly useful since lexicalized TAGs organize trees into tree families which capture all variations of a predicative lexical item for a given subcategorization frame.

• Utilities for editing and processing equations for unification-based tree-adjoining grammar [Vijay-Shanker and Joshi, 1988; Schabes, 1990].

• A predictive left to right parser for unification-based tree-adjoining grammar [Schabes, 1991].

• Utilities for defining a grammar (set of trees, set of tree families, set of lexicons) which the parser uses.

• Morphological lexicons for English [Karp et al., 1992].

• A tree-adjoining grammar for English that covers a large range of linguistic phenomena.

2 XTAG Components

The communication with the user is centralized around the interface manager window (see Figure 2) which gives the user control over the different modules of XTAG.

Figure 2: Manager Window.

This window displays the contents of the tree buffers currently loaded into the system. The different functions of XTAG are available by means of a series of pop-up menus associated with buttons, and by means of mouse actions performed on the mouse-sensitive items (such as the tree buffer names and the tree names).


A tree editor for a tree contained in one of the tree buffers shown in the window can be called up by clicking on its tree name. Each tree editor manages one tree, and as many tree editors as needed can run concurrently. For example, Figure 2 holds a set of files (such as Tnx0Vs1.trees)³ which each contain trees (such as anx0Vs1). When this tree is selected for editing, the window shown in Figure 3 is displayed. Files can be handled independently or in groups, in which case they form a tree family (flag F next to a buffer name).

Figure 3: A tree-editing window for the tree anx0Vs1.

All the editing and visualizing operations are performed through this window (see Figure 3). Some of them are:

• Add and edit nodes.

• Copy, paste, move or delete subtrees.

• Combine two trees with adjunction or substitution. These operations keep track of the derivational history and update attributes stated in form of feature structures as defined in the framework of unification-based tree-adjoining grammar [Vijay-Shanker and Joshi, 1988].

• View the derivational history of a derived tree and its components (elementary trees).

• Display and edit feature structures.

• Postscript printing of the tree.

³ The particular conventions for the tree and family names reflect the structure of the trees and they can be ignored by the reader.

XTAG uses a centralized clipboard for all binary operations on trees (all operations are either unary or binary). These operations (such as paste, adjoin or substitute) are always performed between the tree contained in XTAG's clipboard and the current tree. The contents of the clipboard can be displayed in a special view-only window.

A request to view the derivational history of a tree that results from a combining operation opens a view-only window which displays the associated derivation tree. Each node in a derivation tree is mouse-sensitively linked to an elementary tree.

Since the derivational history of a derived tree depends on the elementary trees which were used to build it, inconsistency in the information displayed to the user could arise if the user attempts to modify an elementary tree which is being used in a derivation. This problem is solved by ensuring that, whenever a modifying operation is requested, full consistency is maintained between all the views. For instance, editing a tree used in a derivation tree will break the link between those two. Thus consistency is maintained between the derived tree and the derivation tree.

Figure 4 shows an example of a derived tree (leftmost window) with its derivation tree window (middle window) and an elementary tree participating in its derivation (rightmost window).

As is shown in Figure 3, the tree display module handles the bracketed display of feature structures (in unification-based TAG, each node is associated with two feature structures, top and bottom; see Vijay-Shanker and Joshi [1988] for more details). The tree formatting algorithm guarantees that trees that are structural mirror images of one another are drawn such that their displays are reflections of one another [Chalnick, 1989]. A unification module handles the updating of feature structures for TAG trees.

XTAG includes a predictive left to right parser for unification-based tree-adjoining grammar [Schabes, 1991]. The parser is integrated into XTAG and derivations are displayed by the interface as illustrated in Figure 4. The parser achieves an $O(G^2 n^6)$-time worst case behavior, $O(G^2 n^4)$-time for unambiguous grammars and linear time for a large class of grammars. The parser uses the following two-pass parsing strategy (originally defined for lexicalized grammars [Schabes et al., 1988]) which improves its performance in practice [Schabes and Joshi, 1990]:

• In the first step, the parser selects the set of structures corresponding to each word in the sentence. Each structure can be considered as encoding a set of 'rules'.

• In the second step, the parser tries to see whether these structures can be combined to obtain a well-formed structure. In particular, it puts the structures corresponding to arguments into the structures corresponding to predicates, and adjoins, if needed, the auxiliary structures corresponding to adjuncts to what they select (or are selected) for. This step is performed with the help of a chart in the fashion of Earley-style parsing.
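The first pass of this strategy amounts to a lexical lookup, which the following Python sketch illustrates. The lexicon contents, the tree names and the `select_structures` function are hypothetical and are not taken from XTAG. Recording the word position with each selected structure is a choice made for this sketch, and the chart-based second pass is not shown.

```python
# Hypothetical syntactic lexicon: each word selects a list of elementary trees.
LEXICON = {
    "boy": ["alpha_N_boy"],
    "saw": ["alpha_nx0Vnx1_saw", "alpha_N_saw"],
    "thinks": ["beta_nx0Vs1_thinks"],
}


def select_structures(sentence, lexicon):
    """First pass: keep only the structures selected by words of the input.

    Returns a list of (tree_name, word, position) triples.
    """
    selected = []
    for position, word in enumerate(sentence.split()):
        for tree_name in lexicon.get(word, []):
            selected.append((tree_name, word, position))
    return selected


selected = select_structures("boy saw boy", LEXICON)
```

For this input the tree selected by "thinks" never reaches the second pass, so the parser works with a sentence-specific subset of the grammar, while an ambiguous word such as "saw" contributes all of its structures.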



Figure 4: left, a derived tree, middle, its derivation, right, an elementary tree participating in the derivation.
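The link between a derivation tree node and its elementary tree, and the way an edit breaks that link, can be pictured with the following Python sketch. The paper does not say how XTAG implements this. The version counter and the class names below are assumptions chosen only to make the idea concrete.

```python
class ElementaryTree:
    """An editable elementary tree (hypothetical stand-in)."""

    def __init__(self, name):
        self.name = name
        self.version = 0

    def edit(self):
        # Any modifying operation bumps the version.
        self.version += 1


class DerivationNode:
    """A derivation tree node that remembers the tree version it was built from."""

    def __init__(self, tree):
        self.tree = tree
        self.version_seen = tree.version

    def linked_tree(self):
        # The link is considered broken once the tree has been edited.
        if self.tree.version == self.version_seen:
            return self.tree
        return None


tree = ElementaryTree("anx0Vs1")
node = DerivationNode(tree)
before_edit = node.linked_tree()
tree.edit()
after_edit = node.linked_tree()
```

Here `before_edit` is the elementary tree itself and `after_edit` is `None`, so a view built on the derivation can never show an elementary tree that no longer matches the derived tree.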

The first step enables the parser to select a relevant subset of the entire grammar, since only the structures associated with the words in the input string are selected for the parser. The number of structures filtered during this pass depends on the nature of the input string and on characteristics of the grammar such as the number of structures, the number of lexical entries, the degree of lexical ambiguity, and the languages it defines. In the second step, since the structures selected during the first step encode the morphological value of their words (and therefore their position in the input string), the parser is able to use non-local bottom-up information to guide its search. The encoding of the value of the anchor of each structure constrains the way the structures can be combined. This information is particularly useful for a top-down component of the parser [Schabes and Joshi, 1990].

XTAG provides all the utilities required for designing a lexicalized TAG structured as in Schabes et al. [1988]. All the syntactic concepts of lexicalized TAG (such as the grouping of the trees in tree families which represent the possible variants on a basic subcategorization frame) are accessible through mouse-sensitive items. Also, all the operations required to build a grammar (such as load trees, define tree families, load syntactic and morphological lexicon) can be predefined with a macro-like language whose instructions can be loaded from a file (see Figure 5).

    (defgrammar demo1
      (:start-symbol "S"
       :start-feature " = ind")
      (:tree-files "lex" "modifiers"
       (:type "trees"))
      (:family-files
       "Tnx0V" "Tnx0Va" "Tnx0Vnx1" "Tnx0Vdn1"
       "Tnx0Vnx1pnx2" "Tnx0Vpnx1" "Tnx0Vs1"
       (:type "trees"))
      (:lexicon-files "lexicon" (:type "lex"))
      (:example-files "examples" (:type "ex")))

Figure 5: An example of instructions for loading and defining a grammar.

The grammar writer also has the option to manually test a derivation by simulating adjoining or substitution of trees that are associated with words defined in the lexicon.

The grammar consists of a morphological English analyzer and a syntactic lexicon, which is the domain of structural choice, subcategorization and selectional information. Lexical items are defined by the tree structure or the set of tree structures they select.

The morphological lexicons for English [Karp et al., 1992] were built with PC-KIMMO's implementation of two-level morphology [Antworth, 1990] and with the 1979 edition of Collins English Dictionary. They comprise 75 000 stems deriving 280 000 inflected forms.

XTAG also comes with a tree-adjoining grammar for English [Abeillé et al., 1990a] which covers a large range of linguistic phenomena.

The entries for lexical items of all types belong to the syntactic lexicon and are marked with features to constrain the form of their arguments. For example, a verb which takes a sentential argument uses features to constrain the form of the verb acceptable in the complement clause. An interesting consequence of TAG's extended


domain of locality is that features imposed by a clausal lexical item can be stated directly on the subject node as well as on the object node. These features need not be percolated through the VP node as in context-free formalisms.

When a word can have several structures, corresponding to different meanings, it is treated as several lexical items with different entries in the syntactic lexicon. Morphologically, such items can have the same category and the same entry in the morphological lexicon.⁴ Examples of syntactic entries follow:⁵

    INDEX: cut
    ENTRY: NP0 cut into NP1
    POS:   NP0 V P1 NP1
    FS:    #inv-, #pass-
    DEF:   make an incision in.

    INDEX: cut
    ENTRY: NP0 cut down (NP1)
    POS:   NP0 V PL (NP1)
    FS:    #pass+
    DEF:   consume less; reduce.
    EX:    The city must cut its expenses down.

    INDEX: accuse
    ENTRY: NP0 accuse NP1 (of NP2)
    POS:   NP0 V NP1 (P2 NP2)
    FS:    #inv1-, #dat-, #inv2-, #pass1+, #pass2-
    DEF:   say that somebody is guilty (of).

    INDEX: face
    ENTRY: NP0 face away (from NP1)
    POS:   NP0 V PL (P1 NP1)
    FS:    #inv-, #pass-
    DEF:   look in the opposite direction (from).
    EX:    My house faces away from the ocean.

⁴ The lexical entries below have been simplified for the purpose of exposition.

⁵ Syntactic lexicons can be stated in various forms. In the following examples, a tree family is associated with each string of parts of speech (POS).

3 The choice of a graphical package: X Window and CLX

The choice of a graphical package was motivated by considerations of portability, efficiency, homogeneity and ease of maintenance. XTAG was built using Common Lisp and its X Window interface CLX.

We chose this rather low-level approach to realize the interface, as opposed to the use of a higher-level toolkit for graphic interface design, because the few available tools that fulfilled our requirements for portability, homogeneity and ease of maintenance were still under development at the beginning of the design of XTAG.

PICASSO [Schank et al., 1990], a more recent package from Berkeley University, offers functionalities similar to those of other non Common Lisp based application frameworks like InterViews [Linton et al., 1989], MacApp [Schmuker, 1986] and Smalltalk [Goldberg, 1983], but is freely distributed. It is an object-oriented system implemented in Common Lisp, CLX and CLOS. With each type of PICASSO object is associated a CLOS class, the instances of which have different graphic and interactive properties. The widgets implementing those properties are automatically created during the creation of a PICASSO object. Unlike the two previous systems, the PICASSO objects may be shared in an external database common to different applications (persistent classes); when this is enabled, PICASSO requires the use of the database management system INGRES [Charness and Rowe, 1989]. PICASSO was not available as a released package at the time the implementation of XTAG started.

Initially we were attracted by GARNET [Meyers et al., 1990], mainly because it is advertised as look-and-feel independent and because it is implemented using only Common Lisp and CLX (but not CLOS, nor any existing X toolkit such as Xtk or Motif). The system is composed of two parts: (1) a toolkit offering objects (prototype-instance model [Lieberman, 1986], constraints), and (2) an automatic run-time maintenance of properties of the graphic objects based on a semantic network. The different behavior of the interface components is specified by binding high-level interactor objects to the graphic objects. An interface builder tool (Lapidary) allows the drawing of the graphic aspects of an interface. However, we did not use GARNET because the version available at the time of the design of XTAG was very large and slow, and still subject to changes of design. Another reason for not choosing GARNET was the fact that Carnegie Mellon University retained the copyrights, slowing the distribution.

Although WINTERP has many interesting points for our purpose, we did not choose it because we wanted to have a complete and efficient (i.e. compiled) Common Lisp implementation. WINTERP is an interpretive, interactive environment for rapid prototyping and application writing using the OSF/Motif toolkit [Young, 1990]. It uses a mini-lisp interpreter (called XLISP; it is not available as a compiler) to glue together various C-implemented primitive operations (Xlib [Nye, 1988] and Xtk [Asente and Swick, 1990]) in a Smalltalk-like [Goldberg, 1983] object system (widgets are a first class type). WINTERP has no copyright restrictions.

The first package we considered was Express Window. It attracted our attention because it was created precisely to run programs developed on the Symbolics machine in other Common Lisp environments. It is an implementation of most of the Flavors and graphic primitives of the Symbolics system, respectively in terms of the Common Lisp Object System (CLOS) [Keene, 1988] primitives and CLX [Scheifler and Lamott, 1989]. We did not use it mainly because it was known to work only with the PCL version from Xerox Parc (we want to remain as compatible as possible across the different dialects of Common Lisp), and was not robust enough.

4 Programming considerations

All the graphic objects of XTAG are defined as contacts and are implemented using only the structures of Common Lisp and their simple inheritance mechanism. Because of the relatively low computing cost associated with the contacts, we have been able to define every


graphic object of XTAG (whatever its complexity as a contact is) without having to resort to a different procedure-oriented implementation for simpler objects as was done in InterViews with the painter objects [Linton et al., 1989].

The programming difficulties we have encountered deal with re-synchronizing XTAG with the server during a transfer of control between contacts (the active contact is the one containing the cursor). These difficulties stem from the asynchronous nature of the communication protocol of X and from the large number of events mouse motion may generate when the cursor is moved over closely located windows. The fact that windows may be positioned anywhere and stacked in any order (overlapping windows) makes the handling of those transitions a non-trivial task. A careful choice of the event-masks attached to the windows is by itself insufficient to solve the problem.

To limit the number of queries made to the server, we use extensive caching of graphic properties. The structures implementing the contacts contain fields that duplicate server information. They are updated when the graphic properties of the object they describe are changed. We found this strategy to improve the performance noticeably. This feature can easily be turned off, in case a particular X-terminal or workstation would provide hardware support for caching.

While we paid a lot of attention to the issue of portability, we did not worry about look independence, limiting the user possibilities in this domain to geometric dimension parameterization and font selection by means of a configuration file and a few menus.

Our current implementation uses the twm window manager, but another window manager could also be used. We have found the need for multi-font string support for XTAG because the tree names and node labels require a mix of at least two or three fonts (ASCII symbols, Greek symbols such as α, β and ε, and a font for subscripts). We could have used a single font containing all the characters (subscripts may use the same font as the normal characters, since they differ from those only by their position relative to the writing line), but we preferred to define a multi-font composed of several existing fonts (which can be customized by the user) for portability purposes and to leave open the way for future extensions.

In order to be able to scroll over the trees when they are too big to be displayed in a window, every tree editor window is associated with an eight-direction touch-pad (inspired by the mover of InterViews [Linton et al., 1989]).

The nodes displayed in the window of a tree editor are not sensitive to the presence of the cursor; they react only to mouse button clicks. In earlier versions of XTAG we highlighted the visited node with a border, but this required too much overhead because of the numerous cursor motions over the tree window which occur during editing.

The text editing tasks we had to implement fall into two classes:

• short line editing requiring multi-fonts (e.g. edition of node names);

• text editing not requiring multi-fonts (e.g. multi-line comments, unification equations).

For the former, we implemented all the editing functions ourselves because they do not require much processing and multi-font support was unavailable. For the latter, we used system calls to an external editor (emacs in our case).

Concerning the programming task, we would have liked to have available tools to help us write an X application in Common Lisp at a level slightly higher than the one of the CLX interface without going up to the level of elaborate toolkits like GARNET or PICASSO, which imply the use of a complex infrastructure; perhaps something like an incremental or graded toolkit.

Our next development effort will be concerned with introducing parallelism in the interface (actors), adding new features like an undo mechanism (using the Item list data structure proposed by Dannenberg [1990]), and extending XTAG for handling meta-rules [Becker, 1990] and Synchronous TAGs [Shieber and Schabes, 1990b] which are used for the purpose of automatic translation [Abeillé et al., 1990b] and semantic interpretation [Shieber and Schabes, 1990a; Shieber and Schabes, 1991].

5 Requirements for Running XTAG

Version 0.93 of XTAG is available as pub/xtag0.93.tar.Z by anonymous ftp to linc.cis.upenn.edu (130.91.6.8).⁶ XTAG requires:

• A machine running UNIX and X11R4.⁷

• A Common Lisp compiler which supports the latest definition of Common Lisp [Steele, 1990]. Although XTAG should run under a variety of Lisp compilers, it has only been tested with Lucid Common Lisp 4.0 and Allegro Common Lisp 4.0.1. A version running under KCL is currently under development.

• Common Lisp X Interface (CLX) version 4 or higher.⁸

XTAG has been tested on UNIX based machines (R4.4) (SPARC station 1, SPARC station SLC, HP BOBCATs series 9000 and SUN 4, also with NCD X-terminals) running X11R4 and CLX with Lucid Common Lisp (4.0) and Allegro Common Lisp (4.0.1).

⁶ Newer versions of the system are copied into the same directory as they become available.

⁷ Previous releases of X will not work with XTAG. X11R4 is free software available from MIT.

⁸ CLX is the Common Lisp equivalent to the Xlib package for C. It allows the Lisp programmer to use the graphical primitives of Xlib within Common Lisp.

6 Conclusion

We described a workbench for the development of tree-adjoining grammars and their parsers and discussed some issues that arise in the design of the graphic interface.

The unique characteristics of tree-adjoining grammars, their elementary objects found in the lexicon (extended trees) and the derivational history of derived


trees (also a tree), require a specially crafted interface in which the perspective is shifted from a string-based to a tree-based system. XTAG provides such a workbench.

The kernel of XTAG is a predictive left to right parser for unification-based tree-adjoining grammar [Schabes, 1991]. XTAG includes a graphical editor for trees, a graphical tree printer based on a new algorithm, utilities for manipulating and displaying feature structures for unification-based tree-adjoining grammar, facilities for keeping track of the derivational history of TAG trees combined with adjoining and substitution, a parser for unification-based tree-adjoining grammars, utilities for defining grammars and lexicons for tree-adjoining grammars, a morphological recognizer for English (75 000 stems deriving 280 000 inflected forms) and a tree-adjoining grammar for English that covers a large range of linguistic phenomena.

XTAG without the large morphological and syntactic lexicons is public domain software. The large morphological and syntactic lexicons can be obtained through an agreement with ACL's Data Collection Initiative.

XTAG runs under Common Lisp and X Window (CLX).

Acknowledgments

Many people contributed to the development of XTAG.

We are especially grateful to Mark Liberman for his help with the extraction of morphological and syntactic information available in the data collected by the ACL's Data Collection Initiative.

We have benefitted from discussions with Evan Antworth, Lauri Karttunen, Anthony Kroch, Fernando Pereira, Stuart Shieber and Annie Zaenen.

Anne Abeillé, Kathleen Bishop, Sharon Cote, Beatrice Daille, Jason Frank, Caroline Heycock, Beth Ann Hockey, Megan Moser, Sabine Petillon and Raffaella Zanuttini contributed to the design of the English and French TAG grammars.

The morphological lexicons for English were built by Daniel Karp.

Patrick Martin automated the acquisition of some of the syntactic lexicons with the use of on-line dictionaries.

We would like to thank Jeff Aaronson, Tilman Becker, Mark Foster, Bob Frank, David Magerman, Philip Resnik, Steven Shapiro, Martin Zaidel and Ira Winston for their help and suggestions.

References

[Abeillé et al., 1990a] Anne Abeillé, Kathleen M. Bishop, Sharon Cote, and Yves Schabes. A lexicalized tree adjoining grammar for English. Technical Report MS-CIS-90-24, Department of Computer and Information Science, University of Pennsylvania, 1990.

[Abeillé et al., 1990b] Anne Abeillé, Yves Schabes, and Aravind K. Joshi. Using lexicalized tree adjoining grammars for machine translation. In Proceedings of the 13th International Conference on Computational Linguistics (COLING'90), Helsinki, 1990.

[Abeillé, 1988] Anne Abeillé. A lexicalized tree adjoining grammar for French: the general framework. Technical Report MS-CIS-88-64, University of Pennsylvania, 1988.

[Antworth, 1990] Evan L. Antworth. PC-KIMMO: a two-level processor for morphological analysis. Summer Institute of Linguistics, 1990.

[Asente and Swick, 1990] P. J. Asente and R. R. Swick. X Window System Toolkit. Digital Press, 1990.

[Becker, 1990] T. Becker. Meta-rules on tree adjoining grammars. In Proceedings of the 1st International Workshop on Tree Adjoining Grammars, Dagstuhl Castle, FRG, August 1990.

[Chalnick, 1989] Andrew Chalnick. Mirror image display of n-ary trees. Unpublished manuscript, University of Pennsylvania, 1989.

[Charness and Rowe, 1989] D. Charness and L. Rowe. CLING/SQL - Common Lisp to INGRES/SQL interface. Technical report, U.C. Berkeley, Computer Science Division, 1989.

[Dannenberg, 1990] R. B. Dannenberg. A structure for efficient update, incremental redisplay and undo in graphical editors. Graphical Editors, Software Practice and Experience, 20(2), February 1990.

[Goldberg, 1983] A. Goldberg. Smalltalk-80: The Interactive Programming Environment. Addison Wesley, 1983.

[Joshi and Schabes, 1991] Aravind K. Joshi and Yves Schabes. Tree-adjoining grammars and lexicalized grammars. In Maurice Nivat and Andreas Podelski, editors, Definability and Recognizability of Sets of Trees. Elsevier, 1991. Forthcoming.

[Joshi et al., 1975] Aravind K. Joshi, L. S. Levy, and M. Takahashi. Tree adjunct grammars. Journal of Computer and System Sciences, 10(1), 1975.

[Joshi, 1985] Aravind K. Joshi. How much context-sensitivity is necessary for characterizing structural descriptions: Tree Adjoining Grammars. In D. Dowty, L. Karttunen, and A. Zwicky, editors, Natural Language Processing: Theoretical, Computational and Psychological Perspectives. Cambridge University Press, New York, 1985.

[Joshi, 1987] Aravind K. Joshi. An Introduction to Tree Adjoining Grammars. In A. Manaster-Ramer, editor, Mathematics of Language. John Benjamins, Amsterdam, 1987.

[Karp et al., 1992] Daniel Karp, Yves Schabes, and Martin Zaidel. Wide coverage morphological lexicons for English. Submitted to the 14th International Conference on Computational Linguistics (COLING'92), 1992.

[Keene, 1988] S. Keene. Object-Oriented Programming in Common Lisp. Addison-Wesley, 1988.

[Kroch and Joshi, 1985] Anthony Kroch and Aravind K. Joshi. Linguistic relevance of tree adjoining grammars. Technical Report MS-CIS-85-18, Department

229

of Computer and Information Science, University of am Lehrstuhl fiir Informatik, Universit~ des Saarlan- Pennsylvania, April 1985. des, June 1988. [Lee, 1987] Albert Lee. Performance oriented tree- [Schmuker, 1986] K. J. Schmuker. Mac-app: An appli- display package. University of Pennsylvania Senior cation framework. Byte, August 1986. Thesis, 1987. [Shieber and Schabes, 1990a] Stuart Shieber and Yves [Lieberman, 1986] H. Lieberman. Using prototypical ob- Schabes. Generation and synchronous tree adjoining jects to implement shared behavior in object oriented grammars. In Proceedings of the fifth International systems. In A CM Conference on Object-Oriented Pro- Workshop on Natural Language Generation, Pitts- gramming Systems Languages and Applications, 1986. burgh, 1990. [Linton et al., 1989] M. A. Linton, J. M. Vlissides, and [Shieber and Schabes, 1990b] Stuart Shieber and Yves P. R. Calder. Composing user interfaces with inter- Schabes. Synchronous tree adjoining grammars. In views. IEEE Computer, February 1989. Proceedings of the 13 th International Conference on [Meyers et al., 1990] B. A. Meyers, D. Giuse, B. Dan- Computational Linguistics (COLING'90), Helsinki, nenberg, and V. B. Zanden. The garnet toolkit refer- 1990. ence manuals: Support for highly-interactive, graphi- [Shieber and Schabes, 1991] Stuart Shieber and Yves cal user interfaces in lisp. Technical Report CMU-CS- Schabes. Generation and synchronous tree adjoin- 90-117, Carnegie Mellon University, 1990. ing grammars. Computational Intelligence, 4(7), 1991. [Nye, 1988] A. Nye. Xlib Progamming Manual (Vol. 1), forthcoming. Xlib Reference Manual (Vol. P). O'Reilly and Asso- [Steele, 1990] Guy L. Jr. Steele, editor. Common LISP- ciates, 1988. the Language. Digital Press, second edition, 1990. [Reingold and Tolford, 1981] E.M. Reingold [Vijay-Shanker and Joshi, 1988] K. Vijay-Shanker and and John S. Tolford. Tidier drawing of trees. IEEE Aravind K. Joshi. 
Feature structure based tree ad- Transactions on Software Engineering, SE-7:223-228, joining grammars. In Proceedings of the 12 th In- 1981. ternational Conference on Computational Linguistics [Schabes and Joshi, 1990] Yves Schabes and Aravind K. (COLING'88), Budapest, August 1988. Joshi. Parsing with lexicalized tree adjoining gram- [Young, 1990] D. A. Young. OSF/MOTIF Reference mar. In Masaru Tomita, editor, Current Issues in Guide. Prentice, 1990. Parsing Technologies. Kluwer Accadernic Publishers, 1990. [Schabes et al., 1988] Yves Schabes, Anne Abeill~, and Aravind K. Joshi. Parsing strategies with 'lexicalized' grammars: Application to tree adjoining grammars. In Proceedings of the 12 ~h International Conference on Computational Linguistics (COLING'88), Budapest, Hungary, August 1988. [Schabes, 1989] Yves Schabes. TAG system user manual for Symbolics machines, 1989. [Schabes, 1990] Yves Schabes. Mathematical and Com- putational Aspects of Lexicalized Grammars. PhD the- sis, University of Pennsylvania, Philadelphia, PA, Au- gust 1990. Available as technical report (MS-CIS-90- 48, LINC LAB179) from the Department of Computer Science. [Schabes, 1991] Yves Schabes. The valid prefix property and left to right parsing of tree-adjoining grammar. In Proceedings of the second International Workshop on Parsing Technologies, Cancun, Mexico, February 1991. [Schank et al., 1990] P. Sehank, J. Konstan, C. Liu, A.R. , S. Seitz, and B. Smith. Picasso reference manual. Technical report, University of California, Berkeley, 1990. [Scheifler and Lamott, 1989] W. Seheifler, R. and O. Lamott. Clx - common lisp x interface. Technical report, Texas Instruments, 1989. ISchifferer, 19881 Klaus Schifferer. TAGDevENV eine Werkbank fiir TAGs. Technical report, KI - Labor
