cmp-lg/9410007   18 Oct 94

A Formal Look at Dependency Grammars and Phrase-Structure Grammars,
with Special Consideration of Word-Order Phenomena

Owen Rambow and Aravind Joshi
Department of Computer and Information Science
University of Pennsylvania, Philadelphia, PA, USA
{rambow,joshi}@linc.cis.upenn.edu

To appear in: Leo Wanner (ed.), Current Issues in Meaning-Text Theory, Pinter, London. A version of this paper was presented at the Workshop on the Meaning-Text Theory (Darmstadt, Germany, August). This work was partially supported by grants from ARO, DARPA, and NSF. We would like to thank Richard Hudson and the anonymous reviewers for helpful comments and suggestions.

Abstract

Meaning-Text Theory (MTT) and the theories of the Chomskyan transformational tradition, such as Government and Binding Theory (which we will henceforth refer to as GB), have in the past generally been compared on the basis of the linguistic insights expressed in them, not on the basis of the underlying formal systems: both traditions have shed their original explicitly mathematical underpinnings, dependency grammars (DGs) in the case of MTT and context-free phrase-structure grammars (CFGs) in the case of the Chomskyan tradition. A mathematical comparison between the underlying formal systems will therefore not tell us much about the linguistic theories. Instead, we must ask how the formalisms affect the linguistic theories that are developed in them. We claim that the central role that the lexicon plays in MTT and other dependency-based theories cannot be replicated in theories based on CFGs, and that lexicalizing CFGs leads naturally to Tree Adjoining Grammars (TAGs), whose derivations closely resemble dependency structures. Definitions and results, such as the computational complexity of certain nonprojective constructions, can therefore be compared directly and transferred between the two traditions, and we suggest a way of incorporating locality of word order into a Meaning-Text Model (MTM). We illustrate these points by discussing several nonprojective constructions.

Introduction

The formalisms that linguistic theories use for the purpose of expressing syntactic structure can differ in two ways. Firstly, the formalisms can differ in the type of representation they use: a phrase-structure grammar postulates the existence of nonterminal syntactic categories, while a dependency grammar does not. Secondly, the linguistic theories can differ in how they use the syntactic representations they have chosen: early Chomskyan approaches followed a generativist approach, while MTT does not. As has been pointed out previously (Kunze), these two issues, the definition of the formalism itself and how it is used, are orthogonal: it is perfectly possible to define a generative DG (see for instance Hays). While the difference between a generativist and a non-generativist approach will have profound methodological and perhaps philosophical implications for the resulting linguistic theories, in this paper we will concentrate on the representational difference in the formalisms themselves.

It has often been observed that a key linguistic difference between MTT and other dependency-based theories on the one hand, and GB on the other hand, is the central role that the lexicon plays in MTT but not in GB (see e.g. Sgall). We claim that this difference is not coincidental, but a mathematical property of the underlying formalisms: context-free phrase-structure grammars cannot be the basis of a lexicon-oriented linguistic theory (in a technical sense which we will define in the next section), while dependency grammars must be.[1] An attempt to lexicalize CFGs leads naturally to a more powerful phrase-structure system called Tree Adjoining Grammars (TAGs).[2] In the next section, we will show that TAGs show many important similarities to DGs. These similarities have two beneficial results: firstly, we are able to apply formal results from the mathematical study of phrase-structure grammars to DGs; secondly, we are able to transfer linguistic analyses made in one framework to the other framework. We will then illustrate these points by looking at several nonprojective syntactic constructions.

Our major goal in this paper is to study the interaction between formal systems and linguistic theories, and to explore how results in the framework of one theory can be expressed in the particular formal context of other theories. Such work provides insights into those aspects, linguistic and formal, that appear to be invariant across a class of formalisms. The reader should not interpret our goal as suggesting that MTT needs to adopt a phrase-structure representation for whatever reason.

Dependency, Phrase Structure, and the Lexicon

One of the most important features of MTT is the central role that the lexicon plays (see e.g. Mel'čuk and Polguère); in fact, much of the MTT literature deals with the lexicon. For syntactic purposes, it contains information about the subcategorization frame of a lexeme and how the arguments are realized (case assignment and function). The importance of the lexicon for syntactic theories has also been increasingly recognized in the American linguistic traditions. We will take it as a given and address the question how a phrase-structure-based syntactic theory can be adapted to a lexical approach. It turns out that there are intrinsic formal problems. These problems have been investigated in detail by Schabes; for a summary of some of the mathematical properties of tree grammars, including lexicalization, see Joshi and Schabes. We will provide a brief discussion here.

[1] From a historical perspective, it presumably was the interest in developing a lexicon-oriented linguistic theory that led to the use of a dependency grammar for MTT. However, in this paper we take a synchronic view.

[2] TAG was originally introduced as a tree generating system on its own (Joshi et al.). It was only recently shown that TAGs can lexicalize CFGs. In this paper we will only be interested in lexicalized TAGs. For a general introduction to TAGs, see Joshi.

If we want to analyze how formal systems can be used for linguistic theories, we must start by determining what sort of elementary structures the formalism provides, and how these elementary structures are combined using the combining operations defined in the formalism. We will illustrate these notions with some examples. First, consider CFGs. In a CFG, a grammar consists of a set of rewrite rules, which associate a single nonterminal symbol with a string of terminal and nonterminal symbols. Here is a sample context-free grammar:

(a) S  → NP VP
(b) VP → really VP
(c) VP → V NP
(d) V  → likes
(e) NP → John
(f) NP → Lyn

Each of these rules is an elementary structure in this grammar. We combine these elementary structures by using one rule to rewrite a symbol introduced by another rule. For example, when we use rule (a), we introduce the nonterminal symbols, or nodes, NP and VP. We may rewrite the VP node by using rule (b) or (c). This grammar generates, among others, the following string:

John really likes Lyn

Derivations in CFGs can be represented as trees: for each nonterminal node in the tree, the daughters record which rule was used to rewrite it. The phrase-structure tree that corresponds to this sentence is given in the figure below.

[S [NP John] [VP really [VP [V likes] [NP Lyn]]]]

Figure: Phrase Structure Tree for 'John really likes Lyn'
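The rewriting process just described can be sketched in a few lines of Python (our illustration, not from the paper): the grammar is a map from nonterminals to alternative right-hand sides, and a derivation repeatedly rewrites the leftmost nonterminal.

```python
# Minimal sketch of context-free rewriting with the sample grammar above.
RULES = {
    "S":  [["NP", "VP"]],
    "VP": [["really", "VP"], ["V", "NP"]],
    "V":  [["likes"]],
    "NP": [["John"], ["Lyn"]],
}

def derive(choices):
    """Rewrite the leftmost nonterminal at each step, picking the rule
    alternative given by the next index in `choices`."""
    sentential = ["S"]
    choices = iter(choices)
    while any(sym in RULES for sym in sentential):
        i = next(idx for idx, sym in enumerate(sentential) if sym in RULES)
        sentential[i:i + 1] = RULES[sentential[i]][next(choices)]
    return " ".join(sentential)

# S => NP VP => John VP => John really VP => John really V NP => ...
print(derive([0, 0, 0, 1, 0, 1]))  # John really likes Lyn
```

The sequence of choices plays the role of the derivation tree: it records which rule rewrote each node.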

Now consider a different type of mathematical formalism, Tree Substitution Grammars (TSG). In a TSG, the elementary structures are phrase-structure trees. A sample grammar is given in the figure below. It consists of three trees, one of which is rooted in S and two of which are rooted in NP. Note that even though, from the point of view of a CFG, a tree is a derived object, not an elementary one, we have defined TSGs in such a way that a tree is now an elementary object of the grammar.

α1: [S [NP↓] [VP [V likes] [NP↓]]]   α2: [NP John]   α3: [NP Lyn]

Figure: A sample TSG

We combine elementary structures in a TSG by using the operation of substitution, illustrated schematically in the figure below. We can substitute tree β into tree α if there is a nonterminal symbol on the frontier of α which has the same label as the root node of β (A in the figure); we can then simply append β to α at that node. Nodes at which substitution is possible are called substitution nodes, and are marked with down-arrows (↓). A derivation in our sample TSG proceeds as follows: the trees α2 and α3, representing the two arguments of the verb (John and Lyn), are substituted into the tree α1 associated with the verb, yielding the well-formed tree α4, from which the sentence 'John likes Lyn' can be read off.

α: [S ... A↓ ...]   β: [A ...]   ⇒   γ: [S ... [A ...] ...]

Figure: The Substitution Operation
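The substitution operation can be sketched as follows (our own tree encoding, not the paper's notation: a tree is a (label, children) pair, and a childless node whose label ends in "!" stands for a substitution node instead of the down-arrow):

```python
# Sketch of TSG substitution on (label, children) trees.
def _substitute(tree, sub):
    label, children = tree
    if label == sub[0] + "!" and not children:
        return sub, True                      # fill the substitution node
    out, done = [], False
    for child in children:
        if not done:
            child, done = _substitute(child, sub)
        out.append(child)
    return (label, out), done

def subst(tree, sub):
    """Substitute `sub` at the first matching substitution node of `tree`."""
    return _substitute(tree, sub)[0]

def frontier(tree):
    label, children = tree
    return [label] if not children else [w for c in children for w in frontier(c)]

# The sample TSG: alpha1 is anchored by 'likes'; alpha2 and alpha3 are NP trees.
alpha1 = ("S", [("NP!", []), ("VP", [("V", [("likes", [])]), ("NP!", [])])])
alpha2 = ("NP", [("John", [])])
alpha3 = ("NP", [("Lyn", [])])

tree = subst(subst(alpha1, alpha2), alpha3)
print(" ".join(frontier(tree)))  # John likes Lyn
```

The frontier of the resulting tree is exactly the derived sentence.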

Finally, compare the DG used for MTT to CFG and TSG. In DG, the elementary structures are simply nodes labeled with terminal symbols (i.e., lexical items); there are no nonterminal symbols. Nodes are composed by establishing dependency relations between them. The result is a dependency tree.

CFGs and TSGs are weakly equivalent: they generate the same languages. However, to a linguist they look very different. A context-free rule contains a phrase-structure node and its daughters; an elementary tree in a TSG may be of arbitrary height. Put differently, we have increased the domain of locality of the elementary structures of the grammar. This increased domain of locality allows the linguist to state linguistic relationships, such as subcategorization, case assignment, and agreement, differently in a TSG. As an example, take agreement between subject and verb in English. The linguist working in TSG can simply state, by using some feature-based notation, that the verb and the NP in subject position in tree α1 of the figure above agree with respect to a given set of features. The linguist working in CFG has a harder time: since the verb is in rule (d) while the subject NP is in rule (a), he cannot simply state the relation directly, since it is impossible to state constraints that relate nodes in different elementary structures. Instead, the linguist must propose that the NP in fact agrees with the VP in rule (a), and that the VP agreement features are inherited by the V in rule (d). The notion that the VP, and not only the verb, agrees with the subject is a meaningful linguistic proposition, and in fact the TSG linguist could have adopted it as well. However, the crucial issue is that the CFG linguist, because of his choice of formalism, was forced to adopt it, while the TSG linguist may choose to do so or not on independent grounds.

α1: [S [NP↓] [VP [V likes] [NP↓]]]  +  α2: [NP John], α3: [NP Lyn]  ⇒  α4: [S [NP John] [VP [V likes] [NP Lyn]]]

Figure: Substitution of arguments into the initial tree of 'likes'

Now let us turn to our central concern, the role of the lexicon. We will call a grammar lexicalized if every elementary structure is associated with exactly one lexical item, and if every lexical item of the language is associated with a finite set of elementary structures in the grammar. Clearly, dependency grammars, including MTT, are naturally lexicalized in this sense, since the elementary structures simply are the lexical items.
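The definition just given can be made concrete with a small checker (ours, not from the paper), which counts the lexical anchors of each rule of the sample CFG:

```python
# Sketch: a CFG rule is (lhs, rhs); a grammar is lexicalized only if every
# rule contains exactly one terminal symbol (its lexical anchor).
NONTERMINALS = {"S", "NP", "VP", "V"}

def is_lexicalized(rules):
    return all(
        sum(1 for sym in rhs if sym not in NONTERMINALS) == 1
        for _, rhs in rules
    )

grammar = [
    ("S", ["NP", "VP"]),       # (a) no lexical anchor
    ("VP", ["really", "VP"]),  # (b)
    ("VP", ["V", "NP"]),       # (c) no lexical anchor
    ("V", ["likes"]),          # (d)
    ("NP", ["John"]),          # (e)
    ("NP", ["Lyn"]),           # (f)
]
print(is_lexicalized(grammar))  # False: rules (a) and (c) have no anchor
```

The merged rule discussed below, S → NP likes NP, does pass this test, which is exactly why the problem with it only shows up when the adverb is considered.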

The case is more complex for a CFG. Consider the sample grammar given above. We can see that no lexical item is associated with rules (a) and (c); therefore, the grammar is not lexicalized. It would be possible to combine rules (a), (c), and (d) into a single one:

S → NP likes NP

However, it is now impossible to correctly place the adverb 'really', since a lexicalized rule of the form (b) is no longer useful, the VP node having been eliminated: the adverb cannot be inserted between the subject and the verb.

There is a second way of lexicalizing a CFG: instead of merging the two phrase-structure rules into a single rewrite rule, we can combine them and consider the result, a fragment of a phrase-structure tree, an elementary structure. Put differently, we move from CFG to TSG. For example, tree α1 in the figure above is the result of combining rules (a), (c), and (d). As desired, tree α1 is associated with exactly one lexical item, the verb 'likes'. Thus we have now obtained a TSG from a CFG. We can derive the sentence 'John likes Lyn' as shown previously.

It turns out that a TSG is not really what we want either: we are again faced with the problem of getting the adverb in the right place, since there is no node into which to substitute it.[3] This problem is solved by the tree composition operation of adjunction, introduced in the framework of Tree Adjoining Grammars (TAG). Adjunction is shown in the figure below. Tree α, called an initial tree, contains a nonterminal node labeled A; the root node of tree β, an auxiliary tree, is also labeled A, as is exactly one nonterminal node on its frontier, the foot node (all other frontier nodes are terminal nodes or substitution nodes). We take tree α and remove the subtree rooted at its node A, insert tree β in its stead, and then add the subtree of α that we removed earlier at the foot node of β. The result is tree γ. As we can see, adjunction can have the effect of inserting one tree into the center of another. Our linguistic example is continued in the figure further below: tree β1, containing the adverb, is adjoined at the VP node into tree α4. The result is tree α5, which corresponds to the sentence 'John really likes Lyn'. Note that α5 is composed of trees α1, α2, α3, and β1, each of which corresponds to exactly one lexical item, in contrast to the grammar given above.

[3] Having two verbs 'like', one of which also subcategorizes for an adverb, does not solve the problem, since it does not generalize to multiple adverbs, in addition to being linguistically unappealing.

α: [S ... [A t] ...]   β: [A ... A* ...]   ⇒   γ: [S ... [A ... [A t] ...] ...]

Figure: The Adjunction Operation

α4: [S [NP John] [VP [V likes] [NP Lyn]]]  +  β1: [VP really VP*]  ⇒  α5: [S [NP John] [VP really [VP [V likes] [NP Lyn]]]]

Figure: Adjunction of 'really' into the initial tree
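Adjunction itself can be sketched in the same tree encoding used above (ours, not the paper's; a foot node carries its root label plus "*"):

```python
# Sketch of TAG adjunction on (label, children) trees.
def _adjoin(tree, aux):
    label, children = tree
    if label == aux[0]:                       # adjoin at the first matching node
        return _plug_foot(aux, tree), True
    out, done = [], False
    for child in children:
        if not done:
            child, done = _adjoin(child, aux)
        out.append(child)
    return (label, out), done

def _plug_foot(tree, subtree):
    label, children = tree
    if label == subtree[0] + "*" and not children:
        return subtree                        # the excised subtree goes to the foot
    return (label, [_plug_foot(c, subtree) for c in children])

def adjoin(tree, aux):
    return _adjoin(tree, aux)[0]

def frontier(tree):
    label, children = tree
    return [label] if not children else [w for c in children for w in frontier(c)]

# alpha4 is the derived tree for "John likes Lyn"; beta1 is the 'really' tree.
alpha4 = ("S", [("NP", [("John", [])]),
                ("VP", [("V", [("likes", [])]), ("NP", [("Lyn", [])])])])
beta1 = ("VP", [("really", []), ("VP*", [])])

print(" ".join(frontier(adjoin(alpha4, beta1))))  # John really likes Lyn
```

Note how the adverb ends up in the middle of the host tree, which is precisely what neither the merged CFG rule nor plain substitution could achieve.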

A formalism in which the elementary structures of a grammar are trees, and in which the combining operations are adjunction and substitution, is called a TAG. Schabes has shown that a tree composition system is only lexicalizable if the composition operations include adjunction. Thus, the process of lexicalizing a CFG naturally leads to a TAG. TAGs are more powerful formally than CFGs, meaning that they can derive more complex languages than CFGs; they are also more difficult to parse.

Several other proposals have been made to adapt phrase-structure grammars to a lexical approach, including categorial formalisms such as CCG (Steedman) and non-transformational phrase-structure grammars such as LFG (Bresnan and Kaplan) and HPSG (Pollard and Sag). Interestingly, the underlying formalisms of these frameworks are also more powerful than CFG. For a summary of mathematical and computational properties of TAGs and some related phrase-structure formalisms, see Joshi et al.

Like a CFG, a TAG derives a phrase-structure tree, called the derived tree. The derived tree for our example is the right tree (α5) in the adjunction figure above. In addition to the derived phrase-structure tree, a second structure is built up, the derivation tree. In this structure, each of the elementary trees is represented by a single node. Since the grammar is lexicalized, we can identify this node with the base form of the lexeme of the corresponding tree.[4] If a tree t1 is substituted or adjoined into a tree t2, then the node representing t1 becomes a dependent of the node representing t2 in the derivation tree. Furthermore, the arcs between nodes are annotated with the position in the target tree at which substitution or adjunction takes place. In the TAG literature, this annotation is in the form of the tree address of the node, using a formal notation to uniquely identify nodes in trees without reference to linguistic concepts. However, in analogy to the MTT notation, we can simply assign numbers to argument positions and introduce the convention that all other positions are attribute positions, marked as ATTR. The derivation tree for the example derivation above is shown in the figure below. We can see that the derivation structure is a dependency tree which closely resembles the Deep-Syntactic Representation (DSyntR) of MTT.

like
  1 → John
  2 → Lyn
  ATTR → really

Figure: Derivation Tree for 'John really likes Lyn'

The resemblance between the derivation structure and the DSyntR is not a coincidence: it is a direct result of lexicalization. We would like to summarize some striking similarities between an MTM of a language and a TAG grammar for that language:

- As in the case of an MTM, a grammar in the TAG formalism consists of a lexicon whose elementary structures are combined by some very simple rules of composition (substitution and adjunction in the case of TAG).

- The function words are included in the elementary structures of the lexemes that require them in their subcategorization frame; i.e., they are represented in the lexical entries for content words, not separately. They are therefore not represented in the derivation structure, just as they are not represented in the DSyntR.

- A verb subcategorizes for its arguments: there must be exactly one constituent for each of its obligatory arguments. Adjuncts are not subcategorized for, and there is no syntactic limit on their number. In MTT, this is reflected by the fact that there may be an unbounded number of ATTR subtrees, while there is only one subtree for each of the numeric arc labels. In TAG, this distinction is captured by the fact that arguments are substituted (a unique and obligatory step), while adjuncts are adjoined (a recursive but optional step).

- In a TAG, the lexicon consists of one tree family for each lexeme, each tree family containing trees for the syntactic variants of the lexeme (active/passive voice, wh-questions for each argument, etc.). As in the case of MTT, certain syntactic paraphrases can be handled by general rules (metarules; Becker). Lexical functions and syntactic paraphrases that use lexical functions have not yet been introduced in the TAG framework, but they could be integrated in a straightforward way.

[4] This is not exactly what is done in the TAG literature, but the difference is purely notational.

- Phrasemes have been discussed both within the TAG framework (Abeillé and Schabes) and in MTT (Mel'čuk). Both frameworks can account for idioms in a natural and similar way, namely by postulating elementary structures that non-compositionally contain more than one lexeme.

However, it is important to note some differences between the two approaches:

- In TAG, word order must be determined at the same time as dependency; this process cannot be separated into two steps as in MTT.[5] This means that the lexicon in a TAG grammar for a specific language must contain more syntactic information than a lexicon in the MTT framework: not only must it contain information about subcategorization and function words, the trees themselves must also contain enough information so that the word order comes out right.

- While substitution of a tree t1 into tree t2 corresponds to a dependency of the lexemic element of t1 on that of t2, this need not be the case in adjunction. We will see later examples in which t1 is adjoined into t2, but the lexemic element of t2 depends on that of t1. Thus, while adjunction corresponds to the establishment of a syntactic dependency relation, the direction of the relation cannot be determined from the direction of the adjunction alone.

The similarities between MTT and a TAG approach, both in the linguistic approach and in the resulting representations, allow us to use TAG as a way of relating MTT analyses to phrase-structure-based analyses. While much of the work on the interface between the semantic and syntactic levels, on lexical functions, and on syntactic paraphrases in the MTT framework can be reformulated in terms of a TAG analysis, we will concentrate in this paper on applying insights from TAG analyses to the MTT framework.

Formal Aspects of Word-Order Variation

In this section, we will use the close relationship between lexicalized TAGs and DGs to make some observations about nonprojective constructions.[6] There are two potential problems with nonprojective constructions in a dependency-based theory:

- No model for nonprojective constructions is known that is computationally well-behaved.

- The syntax of nonprojective constructions must be expressed differently from that of projective constructions, which is linguistically unmotivated.

We discuss the first point in more detail in the next subsection. The three subsections that follow discuss illustrative syntactic constructions. We then address the second point, presenting a proposal for handling certain nonprojectivity within the MTT notion of syntagm.
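For concreteness, the notion of projectivity can be operationalized with a small sketch (ours, and approximate: we test only for crossing arcs, while the full definition also constrains arcs spanning the root):

```python
# Approximate projectivity test: a dependency tree is taken to be projective
# if no two dependency arcs cross when drawn above the sentence.
def is_projective(heads):
    """heads[i] is the index of word i's governor, or None for the root."""
    arcs = [(min(i, h), max(i, h)) for i, h in enumerate(heads) if h is not None]
    return not any(a < c < b < d
                   for a, b in arcs for c, d in arcs)

# "John really likes Lyn": every other word depends on "likes" (index 2).
print(is_projective([2, 2, None, 2]))   # True
# A cross-serial pattern: word 0 depends on word 2, word 1 on word 3.
print(is_projective([2, 3, None, 2]))   # False
```

Because the comprehension ranges over ordered pairs of arcs in both orders, a single asymmetric crossing condition suffices.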

[5] There have been proposals for formal variants of TAG in which the linear precedence of nodes is stated independently from immediate dominance; see Joshi, Becker et al.

[6] We take the standard definition of projectivity as given in Mel'čuk, which can be shown to be equivalent to the definitions discussed in Marcus.

Computational Properties of Dependency Grammars

The principal reason for studying mathematical aspects of the syntactic formalism used by a linguistic theory is probably the need to explain the computational processes involved in the generation and understanding of language. While it appears that most syntactic constructions in most languages are projective (Mel'čuk and Pertsov), many languages do have syntactic constructions, often (but not always) pragmatically marked, that are not. It has been shown that a fully projective dependency grammar is weakly equivalent to a CFG (Gaifman), where weak equivalence means that for every DG there is a CFG that generates exactly the same set of sentences, and vice versa. The equivalence of projective DGs and CFGs lets us transfer parsing results from CFGs to such grammars. In particular, we know that we can parse a string in a CFG in at most O(n^3) time, i.e., in an amount of time proportional to the cube of the length of the input string. Though the parsing of nonprojective DGs has been discussed (see Covington and the references therein), to our knowledge no formal result has been published. There is reason to believe that in the worst case they can only be parsed in time proportional to an exponential function of the length of the input string (O(2^n)). If this worst case actually occurred in natural language parsing, then a DG would not be a very appealing candidate for a model of human language processing.
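The O(n^3) bound comes from chart parsers such as CKY; here is a sketch (ours) using the sample grammar of the previous section, converted to Chomsky normal form (the category ADV is an assumption introduced by the conversion). The three nested loops over span length, start position, and split point give the cubic bound.

```python
from itertools import product

# CKY recognizer sketch for the sample grammar in Chomsky normal form.
UNARY = {"likes": {"V"}, "John": {"NP"}, "Lyn": {"NP"}, "really": {"ADV"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}, ("ADV", "VP"): {"VP"}}

def cky(words):
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(UNARY.get(w, ()))
    for span in range(2, n + 1):          # O(n) span lengths
        for i in range(n - span + 1):     # O(n) start positions
            j = i + span
            for k in range(i + 1, j):     # O(n) split points
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((b, c), set())
    return "S" in chart[0][n]

print(cky("John really likes Lyn".split()))  # True
```

Via Gaifman's equivalence result, the same recognizer covers any fully projective DG, once it has been recast as a CFG.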

Why is this a potential problem for MTT? Humans appear to be quite good at parsing, i.e., constructing a syntactic representation for a linear string of language. If a linguistic theory wants to account for this process, then it must be able to provide an account of how the syntactic structures the theory postulates can be effectively and efficiently constructed from the input. Even if a linguistic theory does not aim at providing an account of human sentence processing (as in fact neither MTT nor GB do), then it must be the case that such an account can in principle be found, since otherwise the relation of the theory to observable behavior is unclear. An account of human sentence processing must be inherently computational. While a mathematical study cannot, of course, provide a computational theory of processing, it can provide useful guidelines for the elaboration of such a theory, and thus confirm the possibility of elaborating such a theory.

Of course, it could be argued that nonprojective constructions are in fact much more difficult to process than projective ones, and that therefore the lack of a processing account for nonprojective trees is actually welcome rather than a problem. However, data from psycholinguistic experiments suggests that processing difficulty does not pattern with the projective/nonprojective distinction, or, equivalently, the distinction between CFGs and more powerful formalisms. For example, Bach et al. show that the nonprojective Dutch cross-serial dependencies, which we discuss below, are in fact easier to process than German projective nested dependencies. Joshi gives a TAG-based account of these differences that crucially relies on the fact that both constructions can be handled by the same mathematical formalism.

In the previous section, we argued that analyses using lexicalized TAG and dependency-based analyses bear striking resemblances. In the following sections, we will exploit these resemblances and discuss two types of deep nonprojectivity, i.e., nonprojectivity which affects lexical items that are already present at the level of DSyntR. We will argue that the nonprojectivity caused by wh-words in embedded English sentences and by the Dutch cross-serial dependencies can be handled by a TAG. Since TAGs can be parsed in O(n^6) time (Vijay-Shanker), we can conclude that these types of nonprojectivity are well-behaved from a processing point of view. We then briefly discuss a third type of nonprojectivity, in German, which cannot be handled by a TAG.

Embedded wh-words in English

THINK
  1 → YOU
  2 → CLAIM
        1 → MARY
        2 → LIKE
              2 → WHO
              1 → SARAH

Figure: DSyntR for the sentence 'Who do you think that Mary claimed that Sarah liked?'

As in many other languages, wh-words in English questions generally must appear in sentence-initial position. This is also true in the case that the wh-word is an argument of an embedded verb. Strikingly, there is no bound on the depth of the embedding:

Who do you think that Mary claimed that Sarah liked?

Here, the wh-word is an argument of the most deeply embedded verb, 'like', thus causing the nonprojectivity, as can be seen in the figure above. A TAG can capture the long-distance dependency naturally, since the recursive adjunction operation allows an unbounded number of clauses to intervene between directly dependent lexemes. An analysis of wh-movement in the TAG framework has been proposed by Kroch; our analysis is a slight variation of his. The steps are shown in the figures below. We first substitute all arguments into their respective verbal trees, and then adjoin the intermediate claim-clause (β2) into the most deeply embedded like-clause (α1), at the S node immediately dominated by the root. This has the effect of separating the wh-word from its verb, even though they originated in the same structure. We then subsequently adjoin the matrix think-clause (β1) into the intermediate claim-clause.

1

The derivation leads to two structures the derivation tree in Figure and the derived tree in

Figure The derivation structure records the sequence of adjunctions and substitutions that leads

to the derived tree while the derived tree in Figure shows the phrase structure and thus the word

order of the nal sentence These two structures exist in parallel we do not have to determine the

word order from the dep endencybased derivation tree as a separate step

The reader will observe that, contrary to the earlier example, the derivation structure given below does not correspond directly to the DSyntR: the direction of adjunction between the verbs (more precisely, the trees anchored in verbs)[7] does not correspond to the direction of the dependency. Why is this? We have seen that nominal arguments are substituted into verbal trees, and that adjuncts are adjoined into trees they modify. In both instances, the derivation structure corresponds to the dependency structure. However, in the analysis for embedded clauses we have given here, we adjoin the matrix clause into the dependent clause at its S node. This is indicated in the derivation tree by annotating the arcs with an S rather than with an MTT-style annotation. This difference, however, does not affect the point we would like to make in this paper: what is central to this exposition is that a derivation in a TAG is like a dependency analysis in that it establishes direct relations between lexical items. The direction of adjunctions need not correspond to the direction of the dependency, as long as the latter can be retrieved from the former by some linguistically motivated simple procedure. For example, in our case, the actual dependency structure can be derived trivially.

[7] For many constructions, the exact dependency analysis is often a matter of discussion. However, in the case at hand, the issue is quite uncontroversial.

β1: [S [NP↓] [VP [V think] [S [Comp that] S*]]]
β2: [S [NP↓] [VP [V claims] [S [Comp that] S*]]]
α1: [S [NP↓] [S [NP↓] [VP [V likes] [NP ε]]]]
α2: [NP you]   α3: [NP Mary]   α4: [NP who]   α5: [NP Sarah]

Figure: TAG derivation for the example sentence

like
  1 → who
  2 → Sarah
  S → claim
        1 → Mary
        S → think
              1 → you

Figure: TAG derivation tree for the example sentence

S
  NP  who
  S
    NP  you
    VP
      V  think
      S
        Comp  that
        S
          NP  Mary
          VP
            V  claims
            S
              Comp  that
              S
                NP  Sarah
                VP
                  V  likes
                  NP  ε

Figure: TAG derived tree for the example sentence

To obtain it, the arcs marked S are simply inverted.
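This inversion step can be sketched as follows (our encoding: the derivation tree of the figure above as a list of (child, parent, label) arcs):

```python
# Sketch of reading a dependency structure off a TAG derivation tree:
# arcs labeled "S" record an adjunction that runs against the dependency,
# so they are inverted; all other arcs keep their direction.
DERIVATION_ARCS = [
    ("who", "like", "1"), ("Sarah", "like", "2"),
    ("claim", "like", "S"),    # claim-tree adjoined into the like-tree
    ("Mary", "claim", "1"),
    ("think", "claim", "S"),   # think-tree adjoined into the claim-tree
    ("you", "think", "1"),
]

def to_dependencies(arcs):
    """Return (governor, dependent) pairs of the dependency structure."""
    deps = set()
    for child, parent, label in arcs:
        deps.add((child, parent) if label == "S" else (parent, child))
    return deps

deps = to_dependencies(DERIVATION_ARCS)
print(("think", "claim") in deps and ("claim", "like") in deps)  # True
```

The result has 'think' governing 'claim' and 'claim' governing 'like', matching the DSyntR rather than the direction of adjunction.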

We may draw two conclusions. First, since the construction can be represented by a TAG, we can parse this type of nonprojectivity in O(n^6) time. Second, we can state the word-order rules locally, in the tree associated with one clause: in tree α1 of the figure above, the wh-word has been moved to the front of the clause. This local operation becomes a nonprojective one through adjunction. Below, we will propose a way of implementing this locality of word-order rules in the MTT framework.

Embedded Clauses in Dutch

As in German, embedded clauses in Dutch can occur before the clause-final verb in a center-embedding construction. However, the order of the verbs in the two languages differs: while in German the dependencies between the verbs and their arguments are nested, they are cross-serial in Dutch.[8] Consider the following sentence:

omdat Wim Jan Marie de kinderen zag helpen leren zwemmen
because Wim Jan Marie the children saw to-help to-teach to-swim
'because Wim saw Jan help Marie teach the children to swim'

[8] We would like to thank Hotze Rullmann and Marc Verhagen for helping us with this example.

OMDAT
  2 → ZIEN
        1 → WIM
        2 → HELPEN
              1 → JAN
              2 → MARIE
              3 → LEREN
                    1 → (MARIE)
                    2 → DE KINDEREN
                    3 → ZWEMMEN
                          1 → (DE KINDEREN)

Figure: DSyntR for the Dutch example sentence

This construction is one of the well-known nonprojective constructions (see e.g. Mel'čuk), as can be seen in the figure above.[9][10] Our TAG analysis, shown below, is based on that proposed in Joshi and in Kroch and Santorini. The main verb of each clause is raised, an analysis proposed independently of the TAG analysis in the GB literature. We then adjoin each clause into its immediately dependent clause, at the S node immediately dominated by the root node. This pushes both verbs away from their nominal arguments, even though they originate in the same elementary structure. The order of the verbs in the final sentence simply follows from the way the elementary structures are adjoined; no global word-order rules are necessary.

We again conclude that this type of nonprojectivity can be parsed in O(n^6) time, and that the word-order rules can be expressed as local constraints on clause-sized structures.
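The contrast between the Dutch and German patterns can be made concrete with a small sketch (ours; the word indices are assumptions): with n noun phrases followed by n verbs, Dutch links noun phrase i to verb i (cross-serial), while German links noun phrase i to the mirror-image verb (nested).

```python
# Sketch: count crossing pairs among dependency arcs, where each arc is a
# (left_position, right_position) pair over the word string.
def crossing_pairs(arcs):
    return [(p, q) for p in arcs for q in arcs
            if p[0] < q[0] < p[1] < q[1]]

n = 4  # Wim, Jan, Marie, de kinderen / zag, helpen, leren, zwemmen
cross_serial = [(i, n + i) for i in range(n)]       # Dutch: NP_i -- V_i
nested = [(i, 2 * n - 1 - i) for i in range(n)]     # German: mirror image
print(len(crossing_pairs(cross_serial)), len(crossing_pairs(nested)))  # 6 0
```

Every pair of Dutch arcs crosses, while no German arcs do, which is exactly the nonprojectivity that the adjunction-based analysis localizes.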

Scrambling in German

We have seen that in the cases of wh-elements in embedded clauses in English and of Dutch cross-serial dependencies, TAGs can provide an account: we avoid an exponential explosion in computing time for the parsing problem. However, it appears that there are constructions in natural language that surpass even the additional power of TAGs. One such construction derives from the free word order allowed in verb-final languages such as German, Japanese, and Hindi. In German, more than one actant from an embedded clause may be ordered among the actants of the matrix clause. We will refer to these actants as scrambled. In the matrix clause, the scrambled embedded constituents may occupy any position. An example is given in sentences (a) and (b) below:

(a) daß der Detektiv_i niemandem [PRO_i den Verdächtigen
    that the detective-nom noone-dat the suspect-acc
    des Verbrechens zu überführen] verspricht
    the crime-gen to indict promises
    'that the detective has promised noone to indict the suspect of the crime'

(b) daß [des Verbrechens]_k der Detektiv_i [den Verdächtigen]_j
    that the crime-gen the detective-nom the suspect-acc
    niemandem [PRO_i t_j t_k zu überführen] verspricht
    noone-dat to indict promises
    'that the detective has promised noone to indict the suspect of the crime'

9. In this and other DSyntRs, actants that are deleted at subsequent stages (i.e. at SSyntR) are represented in parentheses. Furthermore, in DSyntR trees we follow the common practice of labeling nodes with the infinitives of verbs, while for TAG trees we will label nodes with the fully inflected form.

10. The non-projectivity of the construction is independent of the particular analysis chosen for control verbs of various types.

[Figure: TAG Derivation for the Dutch sentence — elementary trees for zag, helpen, leren, and zwemmen; each verb originates in the same elementary structure as its nominal argument (Wim, Jan, Marie, de kinderen), and each clause is adjoined into the clause it governs.]

In this particular sentence we have a mixture of the cross-serial dependencies of Dutch and of the nested dependencies of standard German. Such constructions pose special problems even for TAGs. [11] In Becker et al. we provide a discussion of the issue and show that a more powerful extension of TAG, Multi-Component TAG (MC-TAG), can handle all cases. In MC-TAGs, first introduced by Weir, the elementary tree is split up into parts which are grouped together into sets. All trees from one set must be adjoined at the same time. The derivation of the example sentence and the resulting derived tree are shown in the figures below. The matrix clause is represented by a tree set containing two trees; these trees are adjoined at different nodes into the single tree representing the embedded clause.

While we have an analysis using MC-TAG, this does not help us with the complexity of the parsing problem: the parsing problem for the relevant version of MC-TAG, called non-local MC-TAG, is known to be NP-complete (see Rambow and Satta for a summary of relevant mathematical and computational properties), which means that with high likelihood it is exponential, and thus no better than the general parsing problem for non-projective DGs. Scrambling remains an open problem from the processing point of view.

11. In the case of one overt nominal argument per clause, TAGs can handle sentences involving only a limited number of embedded clauses.

[Figure: DSyntR of Sentence (b) — a dependency tree rooted in DASS, with VERSPRECHEN governing DER DETEKTIV, NIEMAND, and ÜBERFÜHREN, and ÜBERFÜHREN governing DER VERDÄCHTIGE, DAS VERBRECHEN, and the deleted actant (DER DETEKTIV).]

[Figure: Deriving the Scrambled Sentence (b) — an MC-TAG tree set {β1, β2} for the matrix clause, adjoined at different S nodes into the elementary tree α for the embedded clause.]
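To see why scrambling strains any fixed-power formalism, note that in the limiting case every permutation of the scrambled actants is a potential surface order. A hypothetical count in Python (our own illustration, using the actants of sentence (b)):

```python
from itertools import permutations
from math import factorial

# With k nominal actants free to scramble across clause boundaries, the
# number of candidate surface orders grows as k!: no grammar with a fixed,
# bounded set of clause-local order patterns can enumerate them all.
actants = ["des Verbrechens", "der Detektiv", "den Verdaechtigen", "niemandem"]
orders = {" ".join(p) for p in permutations(actants)}
assert len(orders) == factorial(len(actants))   # 4! = 24 distinct orders
print(len(orders))
# -> 24
```

The factorial growth in licensed orders is the combinatorial core of the NP-completeness result cited above.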

Localizing Syntactic Rules

As discussed in Mel'čuk and Pertsov, word-order rules for unbounded non-projective constructions cannot be stated as syntagms, i.e. as local rules affecting two nodes linked by a dependency relation, or as conditions on syntagms. (Here, unbounded non-projectivity means that there is no limit on the number of lexical items simultaneously in violation of projectivity.) Instead, they must be stated in separate, global rules. The existence of two types of word-order rules is not fully satisfactory: it is motivated not by any linguistic considerations, but only by the mathematical properties of the underlying dependency formalism, and it contradicts the spirit of Mel'čuk's Principle of Maximal Localization (Mel'čuk).

As we have seen in the case of wh-words and Dutch embedded clauses, the TAG approach lets us localize the word-order rules within the elementary structure of a clause (or, more precisely, of a verb), just as, say, the SOV order is localized in elementary trees. How can we transfer this approach to MTT?

[Figure: The Derived Tree for Sentence (b) — the phrase-structure tree that results from adjoining the matrix tree set β1, β2 into the embedded-clause tree α.]

One way of localizing all word-order rules would be to associate phrase-structure trees with nodes of the dependency tree. However, a simpler solution, and one obviously more in the spirit of MTT, would be to associate pairs of strings (i.e. of DMorphRs, or linearized sequences of nodes), rather than just single strings, with nodes of the dependency tree. This approach is inspired by a generalization of CFG called Head Grammar (HG) (Pollard), which has been shown to be formally equivalent to TAG (Weir et al.). Basically, an HG provides a dependency tree and rules for computing the final string, or yield. This string is computed bottom-up: with each node is associated a list of two string segments. As we go up the dependency tree, we compute the yield for each new node based on the yields of its daughter nodes. The segments can be shifted around according to certain rules, and new terminal symbols added, but the segments may not be broken up. We see that this is exactly how the Surface-Syntactic Rules (SSyntR) of MTT operate (see e.g. Mel'čuk), except that in the case of HG there are two strings instead of one.

Our proposal can best be illustrated by giving SSynt rules (hopefully in the spirit of Mel'čuk and Pertsov) in which we use two-part strings to deal with the Dutch cross-serial dependencies (see the syntagms below). A dependency relation is now linearized not as one string but as two, which are represented as a pair separated by a comma, e.g. Y1, Y2. Syntagm 2 takes care of the most deeply embedded clause: the verb X is put in the second segment, while all the overt nominal arguments are put in the first segment. Syntagm 3 applies when the most deeply embedded verb has no dependents at all. Syntagm 1 applies to verbs that subcategorize for clauses: the DMorphR associated with the embedded clause Y is in two segments, called Y1 and Y2. The governing verb of Y, X, is added to the left of Y2; any nominal arguments (the subject or objects of X) are added to the left of Y1.

1. X(V) --(i-1)th completive--> Y(V, inf)  <=>  Y1, X+Y2
   Condition: if X --a--> W(N), with a = 1st completive or a = predicative, then W+...+Y1.

2. X(V) --a--> Y(N)  <=>  Y, X
   Condition: a = ith completive or a = predicative, and there is no Y'(V, inf) such that X --> Y'.

3. X(V)  <=>  --, X
   Condition: X has no dependents at all.

Figure: Syntagms for Dutch Embedded Clauses

As an example, consider the Dutch sentence discussed previously. [12] We give the SSyntR in the figure below.

12. The exact details, in particular the arc labels, are not of interest here. We also omit all features in both the syntactic and morphological representations.

[Figure: SSyntR for the Dutch sentence — a surface-syntactic dependency tree: OMDAT --subordinate-conjunctional--> ZIEN; ZIEN --predic--> WIM and --1st compl--> HELPEN; HELPEN --predic--> JAN, --2nd compl--> MARIE, and --1st compl--> LEREN; LEREN --2nd compl--> KINDEREN and --3rd compl--> ZWEMMEN; KINDEREN --determ--> DE.]

The subtree rooted in kinderen is of course linearized as de kinderen. The clause rooted in zwemmen has no dependents, verbal or nominal; therefore, syntagm 3 applies, and we get a DMorphR consisting of two strings, the empty string and zwemmen. We then need to linearize the clause rooted in leren. Since leren does have a verbal dependent, namely zwemmen, syntagm 1 applies. We have Y1 = the empty string and Y2 = zwemmen. The verb leren is added to the left of Y2. Furthermore, the condition in syntagm 1 specifies that the nominal arguments of leren must precede Y1. We therefore obtain:

DMorphR for subtree rooted in leren:
de kinderen, leren zwemmen

Now consider the subtree rooted in helpen. Again syntagm 1 applies, this time with Y1 = de kinderen and Y2 = leren zwemmen. Again, the head verb is added to the left of Y2, while its nominal arguments are added to the left of Y1. We obtain:

DMorphR for subtree rooted in helpen:
Jan Marie de kinderen, helpen leren zwemmen

Finally, we apply syntagm 1 one more time for the verb zien, and then the syntagm for the subordinate-conjunctional SSyntRel. This latter syntagm (not given here) appends the two parts of the DMorphR of its dependent, the zien node, to the omdat node, giving us the desired result:

DMorphR for the tree rooted in omdat:
omdat Wim Jan Marie de kinderen zien helpen leren zwemmen

Thus, we do not need to have recourse to global rules: the word order of the sentence is fixed in syntagms, despite the existence of unbounded non-projectivity. We can deal with embedded wh-words in a similar manner; for space limitations, we refrain from giving the details here.
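The bottom-up computation just carried out can be sketched as a small Python program. This is our own simplification of the syntagms above, not MTT machinery: `linearize` is a hypothetical helper that carries the two-segment DMorphR of each clause up the dependency tree.

```python
def linearize(verb, nominals, embedded=None):
    """Return the two-segment DMorphR (segment1, segment2) of a clause.

    A verb with no dependents contributes only to segment 2 (cf. syntagm 3);
    a verb with a clausal dependent is prefixed to that dependent's second
    segment, while its nominal arguments are prefixed to the first segment
    (cf. syntagm 1). Segments are extended but never broken up.
    """
    if embedded is None:                      # most deeply embedded verb
        return nominals, [verb]
    seg1, seg2 = embedded
    return nominals + seg1, [verb] + seg2     # verb subcategorizing for a clause

zwemmen = linearize("zwemmen", [])                        # ([], ['zwemmen'])
leren   = linearize("leren", ["de", "kinderen"], zwemmen)
helpen  = linearize("helpen", ["Jan", "Marie"], leren)
zien    = linearize("zien", ["Wim"], helpen)

seg1, seg2 = zien
# The subordinate-conjunctional syntagm appends both segments to omdat:
print(" ".join(["omdat"] + seg1 + seg2))
# -> omdat Wim Jan Marie de kinderen zien helpen leren zwemmen
```

Each call inspects only a verb, its nominal dependents, and the two segments of its clausal dependent, mirroring the locality of the syntagms.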

Note that our proposal does not replace the notion of syntagm as defined in the Surface-Syntactic Component of MTT. Instead, it extends it, and it does so only in those cases where non-projectivity occurs; the other syntagms need not be changed. What is replaced is the notion of global ordering rules to handle cases such as English wh-movement and Dutch verb raising.

Conclusion

In this paper, we have argued that the crucial difference between a CFG-based analysis and a DG-based analysis is that the latter, but not the former, can be lexicon-based. We have described TAG, a phrase-structure grammar which can be lexicalized, and we have shown some similarities in linguistic analyses expressed in DGs and in TAGs. In considering non-projective word-order phenomena, we have shown that two important results can be transferred from the TAG analysis to the MTT analysis: first, we can give upper bounds on processing complexity for specific constructions; second, we do not need to have two types of word-order rules (syntagm-based rules and global rules). Instead, if we extend the definition of a syntagm, all rules can be expressed locally.

Bibliography

Abeillé, Anne and Schabes, Yves. Parsing idioms in tree adjoining grammars. In Fourth Conference of the European Chapter of the Association for Computational Linguistics (EACL), Manchester.

Bach, E., Brown, C., and Marslen-Wilson, W. Crossed and nested dependencies in German and Dutch: A psycholinguistic study. Language and Cognitive Processes.

Becker, Tilman. Meta-rules on tree adjoining grammars. In Proceedings of the 1st International Workshop on Tree Adjoining Grammars, Schloß Dagstuhl.

Becker, Tilman, Joshi, Aravind, and Rambow, Owen. Long distance scrambling and tree adjoining grammars. In Fifth Conference of the European Chapter of the Association for Computational Linguistics (EACL). ACL.

Bresnan, J. and Kaplan, R. Lexical-functional grammar: A formal system for grammatical representation. In Bresnan, J., editor, The Mental Representation of Grammatical Relations. MIT Press.

Covington, Michael. Parsing discontinuous constituents in dependency grammar. Computational Linguistics.

Gaifman, Haim. Dependency systems and phrase-structure systems. Information and Control.

Hays, David G. Dependency theory: A formalism and some observations. Language.

Joshi, Aravind, Levy, Leon, and Takahashi, M. Tree adjunct grammars. J. Comput. Syst. Sci.

Joshi, Aravind K. (a). An introduction to Tree Adjoining Grammars. In Manaster-Ramer, A., editor, Mathematics of Language. John Benjamins, Amsterdam.

Joshi, Aravind K. (b). Word-order variation in natural language generation. Technical report, Department of Computer and Information Science, University of Pennsylvania.

Joshi, Aravind K. Processing crossed and nested dependencies: an automaton perspective on the psycholinguistic results. Language and Cognitive Processes.

Joshi, Aravind K. and Schabes, Yves. Tree-adjoining grammars and lexicalized grammars. In Nivat, Maurice and Podelski, Andreas, editors, Definability and Recognizability of Sets of Trees. Elsevier.

Joshi, Aravind K., Vijay-Shanker, K., and Weir, David. The convergence of mildly context-sensitive grammatical formalisms. In Sells, P., Shieber, S., and Wasow, T., editors, Foundational Issues in Natural Language Processing. MIT Press, Cambridge, Mass.

Kroch, A. Subjacency in a tree adjoining grammar. In Manaster-Ramer, A., editor, Mathematics of Language. John Benjamins, Amsterdam.

Kroch, Anthony and Santorini, Beatrice. The derived constituent structure of the West Germanic Verb Raising construction. In Freidin, R., editor, Principles and Parameters in Comparative Grammar. MIT Press, Cambridge, Mass.

Kunze, Jürgen. Die Auslaßbarkeit von Satzteilen bei koordinativen Verbindungen im Deutschen. Akademie-Verlag, Berlin.

Marcus, Solomon. Sur la notion de projectivité. Zeitschr. f. math. Logik und Grundlagen d. Math.

Mel'čuk, Igor A. Ordre des mots en synthèse automatique des textes russes. T.A. Informations.

Mel'čuk, Igor A. Dependency Syntax: Theory and Practice. State University of New York Press, New York.

Mel'čuk, Igor A. and Pertsov, Nikolaj V. Surface Syntax of English. John Benjamins, Amsterdam/Philadelphia.

Mel'čuk, Igor A. and Polguère, Alain. A formal lexicon. Computational Linguistics.

Nichols, Johanna. The meeting of East and West: confrontation and convergence in contemporary linguistics. In Proceedings of the Fifth Meeting of the Berkeley Linguistics Society.

Pollard, Carl. Generalized phrase structure grammars, head grammars, and natural language. PhD thesis, Stanford University, Stanford, CA.

Pollard, Carl and Sag, Ivan. Information-Based Syntax and Semantics. Vol. 1: Fundamentals. CSLI.

Rambow, Owen and Satta, Giorgio. Formal properties of non-locality. Paper presented at the TAG Workshop.

Schabes, Yves. Computational and mathematical studies of lexicalized grammars. Technical report, Department of Computer and Information Science, University of Pennsylvania.

Sgall, Petr. … and underlying structure. In this volume.

Steedman, Mark. Structure and Intonation. Language.

Vijay-Shanker, K. A Study of Tree Adjoining Grammars. PhD thesis, Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA.

Weir, David, Vijay-Shanker, K., and Joshi, Aravind. The relationship between tree adjoining grammars and head grammars. In 24th Meeting of the Association for Computational Linguistics (ACL), New York.

Weir, David J. Characterizing Mildly Context-Sensitive Grammar Formalisms. PhD thesis, Department of Computer and Information Science, University of Pennsylvania.