<<

philosophies

Article Lexicalised Locality: Local Domains and Non-Local Dependencies in a Lexicalised Tree Adjoining Grammar

Diego Gabriel Krivochen 1,* and Andrea Padovan 2

1 Dipartimento di Culture e Civiltà, Università di Verona, 37129 Verona, Italy 2 Dipartimento di Lingue e Letterature Straniere, Università di Verona, 37129 Verona, Italy; [email protected] * Correspondence: [email protected]

Abstract: Contemporary generative grammar assumes that syntactic structure is best described in terms of sets, and that locality conditions, as well as cross-linguistic variation, are determined at the level of designated functional heads. Syntactic operations (merge, MERGE, etc.) build a structure by deriving sets from lexical atoms and recursively (and monotonically) yielding sets of sets. Additional restrictions over the format of structural descriptions limit the number of elements involved in each operation to two at each derivational step, a head and a non-head. In this paper, we will explore an alternative direction for minimalist inquiry based on previous work, e.g., Frank (2002, 2006), albeit under novel assumptions. We propose a view of syntactic structure as a specification of relations in graphs, which correspond to the extended projection of lexical heads; these are elementary trees in Tree Adjoining Grammars. We present empirical motivation for a lexicalised approach to structure building, where the units of the grammar are elementary trees. Our proposal will be based on cross-linguistic evidence; we will consider the structure of elementary trees in Spanish, English and German. We will also explore the consequences of assuming that nodes in elementary trees are addresses for purposes of tree composition operations, substitution and adjunction.

Keywords: locality; tree adjoining grammar; lexicalised grammar; tree composition

Citation: Krivochen, D.G.; Padovan, A. Lexicalised Locality: Local Domains and Non-Local Dependencies in a Lexicalised Tree Adjoining Grammar. Philosophies 2021, 6, 70. https://doi.org/10.3390/philosophies6030070 (https://www.mdpi.com/journal/philosophies)

Academic Editor: Peter Kosta

Received: 15 July 2021; Accepted: 4 August 2021; Published: 18 August 2021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

The definition of local domains for the application of syntactic rules, as well as rules of semantic interpretation, has been a central topic in generative grammar since the first definitions of a transformational cycle [1–3]. In the early days of generative grammar, cycles were delimited by the presence of two designated nodes, S and NP, as the only categories that could take subjects. From a contemporary perspective, this definition implies a certain heterogeneity. Whereas NP is the projection of a lexical category, S does not fit in the uniformly endocentric template that has been the norm since the mid-1980s, and thus is neither a “lexical” nor a “functional” category in a strict sense. The mixed nature of bounding nodes remained in the Government and Binding (GB) framework, where barriers were nodes defining opaque domains for the application of syntactic operations [4]. For example, CP was a barrier for movement from VP (not L-marked, and dominating the bounding category Inflection), as were VPs (not L-marked) and adjuncts (similarly, not L-marked). Note that C is a functional head, but V is not (adjuncts may be PPs or AdvPs, also lexical categories). The definition of locality and the distinction between lexical and functional categories remained relatively orthogonal to each other.

In the late 1980s, functional categories started playing a much more crucial role in the syntactic derivation (in particular after the layered analysis of the VP in [5] and the “explosion” of the node Infl(ection) in [6]), and the research on locality became centred on functional heads and their projections. With the advent of the phase-theoretic framework, in [7] and much subsequent work, this shift towards functional categories as delimiters of local structure became even more obvious. Syntactic domains are delimited by the presence of functional categories v* and C, which define the size of syntactic chunks where operations of feature checking/valuation and movement/internal merge take place. However, the consequences of this shift are far-reaching. The focus on functional categories as the skeleton of the clause (and the main locus of cross-linguistic variation) has found its maximal expression in the cartographic enterprise as well as nanosyntax (where features configure a finer-grained hierarchy at a smaller level of organisation). Because the order of projections in the phrasal skeleton is fixed (grossly speaking, the phrasal skeleton is an ordered array C > T > V, where C, T and V each stand for a possibly highly articulated sequence of functional heads, as in [8–11]), the occurrence of phase heads and non-phase heads follows a rhythm determined a priori [12–14] (p. 56).
In this paper, we will argue for a different approach to locality in syntax, which follows the lead of lexicalised tree adjoining grammars (TAGs). This entails two departures from contemporary minimalist practice, which will be empirically motivated. First, the grammar builds and operates over graphs, not sets. Second, locality conditions defined in structural representations are intimately related to the distinction between lexical and functional material, such that lexical heads define local interpretative domains. We will provide arguments in favour of a model of the grammar where the size of syntactic domains is neither defined a priori nor determined by the presence of a specially designated functional head. Rather, cycles are defined as the extended projections of lexical heads. In order to do so, we need to present some preliminary definitions of lexicalised TAGs.

2. Main Tenets of the TAG Formalism

A TAG is essentially a rewriting system working on predefined elementary trees that can be augmented and combined with one another at the frontier or “counter-cyclically”; these two cases correspond to the (generalised) operations substitution [15,16] and adjunction [17,18], respectively. Trees may be either elementary or derived: the latter are obtained by applying composition operations to the former. In turn, two kinds of elementary trees are defined: initial trees and auxiliary trees. Initial trees are single-rooted structures that contain a non-terminal node that is identical to the root of an auxiliary tree. Auxiliary trees are also single-rooted structures, which contain a node in their frontier that is identical to their root: this allows for auxiliary trees to be adjoined to initial trees at the non-terminal that corresponds to the root of the auxiliary tree.

The root node of an initial tree IT γ is labelled “S”, and its “frontier” is constituted by terminal nodes. Intermediate nodes (transitively dominated by S) are non-terminals, as seen in Figure 1.

Figure 1. Initial tree.

The root node of an auxiliary tree AT β is labelled “X”, and its frontier contains at least one node with the same label as the root [19] (p. 18). Intermediate nodes are non-terminals. This is diagrammed in Figure 2.

Figure 2. Auxiliary tree.

The crucial aspect of TAGs is the possibility of constructing derived trees from these elementary units under well-defined conditions. Let us define and illustrate the two operations that give us tree composition in a TAG.

• Substitution: let T be a tree with root S and T’ a tree with root S’ and a node labelled S in its frontier. Substitution inserts T into the frontier of T’ at S. In other words, this operation can be conceived of as the rewriting of a node at the frontier of a piece of structure as another piece of structure. In this sense, substitution is a tail-recursive procedure. Substitution is diagrammed in Figure 3.

Figure 3. Substitution.

• Adjoining: in this case, a piece of structure, the auxiliary tree, is inserted into a designated node (the adjoining site) in the initial tree. In other words, the operation of adjoining rewrites a node by “shoehorning” a piece of structure (i.e., the auxiliary tree) inside another structure (i.e., the initial tree). The auxiliary tree must have two distinguished nodes (labelled Z in Figure 4), namely the root node and, at its frontier, the foot node, which must carry the same label as each other, and which must also carry the same label as the target for adjunction in the initial tree (here, that label is Z). Note that the foot node can be the daughter of the root node (as in the auxiliary tree in Figure 4a) or show up much lower in the structure (as in the auxiliary tree in Figure 4b).
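The substitution operation defined above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not part of the formalism as presented in the literature: the class Node, the function substitute and the terminal labels a, b and c are hypothetical names introduced here, and the sketch rewrites the leftmost matching frontier node, whereas a full TAG implementation would let the derivation choose the substitution site.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A labelled tree node; a node without children lies on the frontier."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def frontier(self) -> List["Node"]:
        """Return the frontier (leaf) nodes, left to right."""
        if not self.children:
            return [self]
        leaves: List["Node"] = []
        for child in self.children:
            leaves.extend(child.frontier())
        return leaves

    def bracketed(self) -> str:
        """Render the tree as a labelled bracketing."""
        if not self.children:
            return self.label
        inner = " ".join(child.bracketed() for child in self.children)
        return f"[{self.label} {inner}]"


def substitute(host: Node, tree: Node) -> bool:
    """Rewrite the leftmost frontier node of `host` that carries the root
    label of `tree` as `tree` itself (in place). Returns False if `host`
    has no such frontier node."""
    for leaf in host.frontier():
        if leaf.label == tree.label:
            leaf.children = tree.children  # the leaf now expands as `tree`
            return True
    return False


# T has root S; T' has root S' and a frontier node labelled S.
# The terminals a, b, c are hypothetical placeholders.
t = Node("S", [Node("a"), Node("b")])
t_prime = Node("S'", [Node("c"), Node("S")])
substitute(t_prime, t)
print(t_prime.bracketed())  # prints [S' c [S a b]]
```

Because only a frontier node is rewritten, everything above the substitution site in T’ is left untouched, which is the sense in which the operation applies “at the frontier”.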

Figure 4. Initial and auxiliary trees. Auxiliary tree (a) features a recursive foot node immediately dominated by the root; auxiliary tree (b) features a foot node not immediately dominated by the root.

Both (Figure 4a,b) are admissible auxiliary trees (root and foot nodes are circled as in the original notation in [18] (p. 209)). Adjunction would insert an auxiliary tree (either (Figure 4a) or (Figure 4b)) into the node Z in the initial tree, effectively pushing W and Y downwards. Suppose that we adjoin the auxiliary tree (Figure 4b) to the initial tree above. The resulting tree, referred to as a derived tree, would be (Figure 5).

Figure 5. Derived tree.
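The effect of adjunction on the material below the adjoining site can be made concrete with a short Python sketch. The trees used here are assumptions made for illustration, since the figures themselves are not reproduced in the text: the initial tree is taken to be [X [Z W Y]], and the auxiliary tree [Z u [V v Z*]] stands in for a tree of the kind in Figure 4b, with the foot node (marked * in the comment) not immediately dominated by the root. The names Node, find_foot, find_site and adjoin, and the labels X, V, u and v, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A labelled tree node; `is_foot` marks the foot of an auxiliary tree."""
    label: str
    children: List["Node"] = field(default_factory=list)
    is_foot: bool = False

    def bracketed(self) -> str:
        """Render the tree as a labelled bracketing."""
        if not self.children:
            return self.label
        inner = " ".join(child.bracketed() for child in self.children)
        return f"[{self.label} {inner}]"


def find_foot(node: Node) -> Optional[Node]:
    """Return the node marked as foot, if there is one."""
    if node.is_foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def find_site(node: Node, label: str) -> Optional[Node]:
    """Return the first non-terminal (depth-first) that carries `label`."""
    if node.label == label and node.children:
        return node
    for child in node.children:
        found = find_site(child, label)
        if found is not None:
            return found
    return None


def adjoin(initial: Node, auxiliary: Node) -> bool:
    """Adjoin `auxiliary` at the first node of `initial` that carries its
    root label. Both trees are modified in place."""
    foot = find_foot(auxiliary)
    # Root and foot of the auxiliary tree must share a label,
    # and the foot must be on the frontier.
    if foot is None or foot.label != auxiliary.label or foot.children:
        return False
    site = find_site(initial, auxiliary.label)
    if site is None:
        return False
    foot.children = site.children       # material below the site moves under the foot
    foot.is_foot = False
    site.children = auxiliary.children  # the site is rewritten as the auxiliary tree
    return True


# Assumed initial tree: [X [Z W Y]]
initial = Node("X", [Node("Z", [Node("W"), Node("Y")])])
# Assumed auxiliary tree: [Z u [V v Z*]]
auxiliary = Node("Z", [Node("u"), Node("V", [Node("v"), Node("Z", is_foot=True)])])

adjoined = adjoin(initial, auxiliary)
print(initial.bracketed())  # prints [X [Z u [V v [Z W Y]]]]
```

In the output, W and Y end up dominated by the lower Z, that is, they have been pushed downwards by the adjoined material, as described above.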

A grammar featuring adjunction, as defined in [18] and related work, pushes the generative power of the grammar to mildly context-sensitive; it is worth pointing out that recent developments in so-called minimalist grammars [20] converge on mildly context-sensitive generative power.

Crucially, in both cases (substitution and adjunction), we are asking of the grammar that it be capable of identifying nodes that carry the same label. In other words, the grammar must be sensitive to nodes assigned to the same indexed category, even in distinct local structural chunks. The consequences of requiring that the grammar be capable of identifying nodes that are in some sense (to be specified below) identical even in distinct structural contexts have been, in our opinion, underexplored. The second part of our paper will deal with this issue in detail.

Conditions Imposed on TAG Derivations

The main question we want to address in this paper is how to define the size of elementary trees: what does an elementary tree (minimally) contain? Note that this question, as we posed it, makes no assumption that the size of elementary trees is uniform: as we will insist, a lexicalised TAG gives us the flexibility to define elementary trees of different sizes. Further, if elementary trees are defined in a lexicalised grammar, then it is entirely possible that different natural languages define trees of different sizes, depending on the properties of their lexical categories. This picture of cross-linguistic variation in terms of the size of elementary trees can be found in [21].

After introducing the building blocks of TAG, we can now expound on what has come to be known as the “fundamental TAG hypothesis” and the condition on elementary tree minimality (CETM); a corollary on locality is also provided.

• Fundamental TAG hypothesis: Every syntactic dependency is expressed locally within a single elementary tree [19,21,22] (p. 233).
• Condition on Elementary Tree Minimality (CETM): the heads in an elementary tree must form part of the extended projection (in the sense of [23]) of a lexical head. As a result, the lexical array underlying an elementary tree must include a lexical head and may include any number of functional heads that are associated with that lexical head, see [19] (p. 151). In other words, elementary trees contain the nodes that form the extended projection of a single lexical head: this is what makes a TAG “lexicalised”. The size of elementary trees can be limited under an appropriately restricted formulation of the CETM: what exactly does it mean for a head to be “lexical”, and what does its extended projection minimally contain? This will be one of the main issues to be explored in this paper.

• Non-local Dependency Corollary: Non-local dependencies always reduce to local ones once the recursive structure is factored out [21] (p. 237).

What the CETM states is that a particular lexical head H may entail a functional shell made of the (functional) projections that are assumed to surround it [21] (p. 239). For instance, V might include even T and C as they belong to the extended domain of V, i.e., the domain where selection, agreement and feature inheritance/percolation occur (in this sense, it may be that under certain formulations of the CETM, the extended projection of a V could coincide with its s-projection, in the sense of [24]). At first glance, the fact that a V is endowed with all that functional material might appear counterintuitive, especially from a phase-theoretic perspective, which crucially assumes that v* and C are the two (main) phases. However, the idea of the extended projections of a head H belonging in the elementary tree of H allows us to have a natural definition of local domains that does not depend on designated nodes and is a direct consequence of the CETM as formulated by Frank.¹ As long as an elementary tree does not contain more than one lexical head, it is a perfectly well-formed structure from a TAG perspective; this makes the connection between lexicality/functionality and locality much more natural than in minimalism. What becomes crucial in an LTAG is to have a proper definition of the lexical category, which is independently required in the grammar. After all, in a theory where functional material is the centre of attention, an intensional or extensional definition of what exactly is the functional lexicon of a language also defines, as the complement set, the “lexical” lexicon. In the present perspective, two notions become essential: lexicality and, as we will see below, predication.
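The well-formedness condition just described (one lexical head per elementary tree, plus functional heads associated with it) can be sketched as a simple check. This is a minimal illustration under stated assumptions, not a definition taken from the literature: the inventory LEXICAL follows the commonly assumed lexical categories (N, V, A, Adv, P), the mapping EXTENDS is a hypothetical stand-in for “functional head associated with a lexical head” that only covers the heads mentioned in the text (C and T for V; D, Num and Deg for N), and the function name satisfies_cetm is introduced here for illustration.

```python
from typing import Dict, Iterable, Set

# Assumed inventory of lexical categories.
LEXICAL: Set[str] = {"N", "V", "A", "Adv", "P"}

# Hypothetical mapping from functional heads to the lexical category
# whose extended projection they belong to.
EXTENDS: Dict[str, str] = {"C": "V", "T": "V", "D": "N", "Num": "N", "Deg": "N"}


def satisfies_cetm(heads: Iterable[str]) -> bool:
    """True if `heads` contains exactly one lexical head and every other
    head is a functional head associated with that lexical head."""
    heads = list(heads)
    lexical = [h for h in heads if h in LEXICAL]
    if len(lexical) != 1:
        return False
    anchor = lexical[0]
    return all(EXTENDS.get(h) == anchor for h in heads if h not in LEXICAL)


print(satisfies_cetm(["C", "T", "V"]))       # True: extended projection of V
print(satisfies_cetm(["D", "N"]))            # True: extended projection of N
print(satisfies_cetm(["T", "V", "D", "N"]))  # False: two lexical heads
```

The check models the CETM only as presented above; a more restrictive formulation would change the contents of LEXICAL or EXTENDS, not the logic of the check.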
Note that, under the CETM as presented above, V’s nominal arguments should define independent elementary trees since N is a lexical head (with D, Num, Deg and other functional heads being part of its extended projection). We will see that it is possible to modify the CETM in such a way that it becomes more restrictive with respect to the set of categories that can head elementary trees; this modification, crucially, is substantive, not formal. The generative power of the grammar remains unchanged. In our opinion, the choice between a more or a less restrictive version of the CETM is an empirical issue and may depend on language-specific properties.

In order to have an empirical bite, the Fundamental TAG hypothesis and the Non-local Dependency Corollary require a characterisation of the structural domain that comprises an elementary tree. Just as the context-free formalism tells us nothing about the kind of phrase-structure rules that should be present in a grammar, the TAG formalism says little about the nature of elementary trees. In the words of [25] (pp. 2–3):

“TAG is a formal system that combines elementary trees, but it is not a linguistic theory and imposes no constraints upon the nature of [ . . . ] elementary trees”.

Consequently, from the perspective of the formalism, the grammar of a language could simply consist of an arbitrary finite set of (finite) trees. Such an approach would, however, fail to account for the generalisations that hold across the structures for a particular language, or across languages in general, unless independently motivated constraints are imposed on the nature of elementary trees: what linguistically relevant categories they can and cannot contain. Just as the theory of generalised phrase structure grammar (GPSG) [26] aimed to characterise the set of phrase-structure rules in the grammar of a language, a TAG-based theory of syntax must provide an independent characterisation of the set of elementary trees.
Since the adoption of the TAG formalism commits us to the invariant adjoining and substitution operations, the definition of elementary trees in the grammar of a language is the locus of cross-linguistic differences, imposing substantial restrictions on the kind of variation that we may find. This is a particularly relevant point since different versions of TAGs may impose different restrictions. In a lexicalised TAG (that is, a TAG that adheres to the CETM above), this entails that a source of variation may be precisely the definition of what constitutes a lexical category. In other words, the CETM gives us a framework in which to formulate a theory of configurational variation. The case for a lexicalised grammar that we make in this paper is illustrated with the analysis of empirical data pertaining to Spanish auxiliary chains, which is contrasted with an empirically adequate grammar for English auxiliary chains. LTAGs offer an interesting direction for minimalist inquiry in terms of grammatical analysis and variation.

In this paper, we examine two issues related to the TAG principles above:

1. What is the format of elementary trees? (Or, in other words, how many nodes does an extended projection need, and what are those nodes?).
2. How do non-local dependencies reduce to local ones? What exactly is a “local” dependency?

We will address these issues in the following sections.

3. What Is Inside an Elementary Tree?

The first question we need to address is what elementary trees may contain. The TAG formalism does not impose formal a priori conditions on the content of elementary trees, which means that the substantive definition of elementary trees becomes an eminently empirical matter. We may begin by laying out a meta-theoretical requirement: we aim at minimising the number of non-terminal nodes in elementary trees. We adhere to a principle of economy of expression [27]; it can be seen as a form of syntactic pruning. This means that, to the extent that it yields a descriptively adequate grammar, we want to minimise the number of projections, in particular, empty nodes and a priori functional material. This is consistent with the basic goals of lexicalised grammar but may be interpreted in different ways. One way of lexicalising a TAG is to make all lexical heads (predicates and arguments), but not functional heads, have an extended projection [19,28,29]. In this case, a TAG is lexicalised if every elementary structure is associated with a single lexical item. Another is to have functional heads, as well as lexical heads, head elementary trees [28,30,31]. Finally, it is also possible to restrict the class of lexical heads that can project an elementary tree: not all lexical heads, but only lexical predicates [32,33]. In the latter case, an elementary tree is the smallest structure that contains:

i. A predicative basic expression p.
ii. Temporal and aspectual modifiers of p (e.g., functional auxiliaries, in [34]).
iii. Nominal arguments of p (subjects, objects).

In this view, it is not enough for a head to belong to the set of commonly assumed lexical categories (N, V, A, Adv, P). In order to head an elementary tree (in TAG parlance, be the “anchor” of that tree), that head must also be a predicate: it must, in GB terms, have a thematic grid and a subcategorisation frame. Note that the elementary trees predicted by each approach are different: while a strict version of the CETM requires nominal arguments to head their own elementary trees, if the focus is set on predication rather than just lexicality, the elementary trees defined by the grammar look quite different; we will compare these possibilities below. A theory of this kind is prefigured in [21], where it is claimed that “an elementary tree should be built around a single semantic predicate”; we merely implement the condition that the anchor of an elementary tree is only a semantic predicate. We can consider a very simple example, such as the sentence “John has run”, to illustrate each approach (the specific labels used here have no particular theoretical significance).

First, let us consider a proposal such as that in [30,31] ([21]; see also [29], p. 5), as in Figure 6.

Figure 6. Elementary trees following [30,31].


Note that in Figure 6, not only do the lexical categories N and V head their own elementary trees, but so does the auxiliary has, which is a T head (thus, functional). Elementary trees can also be structured as in [19,21]; see Figure 7.


Figure 7. Elementary trees following [19,21].

In this case, only lexical categories head independent elementary trees: the DP is the extended projection of the lexical category N, and the TP is the extended projection of the lexical V (see also [24,25]).

Finally, the last option we will consider here is the proposal in [32,33], as in Figure 8.

Figure 8. Elementary tree following [32,33].

The third possibility is that only a lexical predicate (as opposed to any lexical head) defines an elementary tree. Figure 8 contains only one such predicate (run), with perfective have and the subject DP being part of the extended projection of this predicate.

The following subsections will deal with the application of these systems to phenomena in Spanish and German, focusing on the structure of auxiliary verb constructions and the possibilities that a lexicalised TAG offers for the development of a comparative agenda.

3.1. Elementary Trees in Spanish Auxiliary Chains

As we said above, we need to define exactly what it means for a node to be a lexical predicate: in [34] and subsequent works focused on the syntax of the verbal domain, a lexical predicate is defined as a syntactic object that can both modify another object and be modified itself. In the case of perfect have, it modifies the lexical verb but cannot be modified itself. Suppose that we had the sentence "John will have run": in this case, the temporal (future) information conveyed by will [35] modifies the event denoted by the lexical verb, as does the perfect aspect. Furthermore, have does not take nominal arguments or assign thematic roles: in this sense, it fits the traditional characterisation of auxiliaries as elements that lack argument structure and thematic grid. Thus, perfective have is a predicate, yes, but not a lexical one.2

2 This system, as it is based on two binary dimensions, predicts the existence of four kinds of syntactic categories:
• Lexical predicates (V, A, P, nominalisations, lexical auxiliaries).
• Lexical non-predicates (proper N, non-nominalised common N).
• Non-lexical predicates (functional auxiliaries, quantifiers).


Whereas the functional nature of perfective have (or its counterpart haber in Spanish) is hardly contested, the case of modal auxiliaries is more complicated and thus requires more detailed discussion. Let us consider the derivation proposed by Frank for a sentence such as "The senator should read a speech", as in Figure 9:

Figure 9. Derivation of ‘the senator should read a speech’ according to [21].

Here, the extended projection of read (the minimal elementary tree containing read) contains should and excludes the nominal arguments (since N is a lexical category, and as such, defines its own elementary tree). In a representation along the lines of [32,33] for the English example, there would be no substitution needed for the nominal arguments (since the definition of an elementary tree includes not only functional modifiers of the lexical predicate but also its arguments), but otherwise, the structure would look very much the same. However, when we consider the corresponding Spanish example, things become more complex:

(1) El senador debería leer un discurso.
    The senator must-COND read-INF a speech.
    "The senator should read a speech."

The main question here is whether the auxiliary deber ('must') belongs in the same elementary tree as the lexical verb leer ('to read'). Under the definition of elementary trees suggested above, if there is a functional predicate that modifies deber but does not modify leer, then it means that deber is indeed a lexical predicate and, as such, may head its own elementary tree. Consider, then, the case in (2) (for the deontic reading of deber):

(2) El senador ha debido leer un discurso.
    The senator have-3SG-AUX-PRF must-PART read-INF a speech.
    "The senator has had to read a speech."

There are two related arguments in favour of the claim that the perfect does not modify the lexical VP, just the modal auxiliary: one is semantic, the other configurational. We may illustrate the point using a proof by contradiction of sorts. Assume, for the sake of argument, that the structure of (2) involves a cartographic-style syntax: monotonically and uniformly binary-branching and highly articulated, based on a fixed universal functional hierarchy as diagrammed in Figure 10:

• Non-lexical, non-predicates (Complementisers).
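The four-way classification in footnote 2 follows mechanically from crossing two binary properties. The sketch below is only an illustration of that point: the feature names `lexical` and `predicate`, the table `EXAMPLES`, and the helper `can_anchor` are hypothetical labels introduced here, and the anchoring condition encodes the predicate-based option discussed above (only lexical predicates head elementary trees), not the other two options.

```python
from itertools import product

# Example members of each class, copied from footnote 2.
# Keys are (lexical, predicate) feature pairs; the encoding is an assumption.
EXAMPLES = {
    (True, True): "V, A, P, nominalisations, lexical auxiliaries",
    (True, False): "proper N, non-nominalised common N",
    (False, True): "functional auxiliaries, quantifiers",
    (False, False): "complementisers",
}

def can_anchor(lexical: bool, predicate: bool) -> bool:
    """Predicate-based condition: only lexical predicates anchor a tree."""
    return lexical and predicate

# Enumerate the four cells of the 2 x 2 classification.
for lexical, predicate in product([True, False], repeat=2):
    status = "anchor" if can_anchor(lexical, predicate) else "non-anchor"
    print(f"lexical={lexical!s:5} predicate={predicate!s:5} "
          f"{status:10} {EXAMPLES[(lexical, predicate)]}")
```

Under this encoding, exactly one of the four cells satisfies the anchoring condition, which is why perfective have (a non-lexical predicate) cannot head its own elementary tree on this view.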


Figure 10. Cartographic clausal hierarchy.

In Figure 10, the spine that corresponds to the sequence of auxiliaries and the VP grows monotonically by introducing a single terminal node per derivational step in an order that follows an a priori fixed universal hierarchy (e.g., [9], p. 133; [10,11]). The issue arises when we evaluate the syntactic and semantic predictions that such a structure makes with respect to our Spanish cases. In a configuration such as Figure 10, given a standard syntactic definition of scope [36], we expect the perfective auxiliary haber to have scope over the modal deber and the lexical verb leer. If it did, then (2) should entail (3), contrary to fact:


(3) El senador ha leído un discurso.
    The senator has-3SG-AUX-PRF read-PART a speech.
    "The senator has read a speech."

In (3), the event denoted by the lexical VP is modified by the perfective auxiliary, and since active participles in Spanish have an anteriority interpretation (for Spanish, see [33]; for English, [37]), we would expect the VP to be interpreted as a past event. This interpretation, furthermore, is a consequence of the assumption that the clausal structure is a monotonically growing binary-branching skeleton. Why? Because

    X c-commands all and only the terms of the category Y with which X was paired (by Merge or by Move) in the course of the derivation ([38], p. 329).

This means that under a cartographic view of phrase structure, it is unavoidable that in (2) haber c-commands leer and thus has scope over it. However, such a syntactic configuration would predict the wrong semantic interpretation: only the modal deber is affected by the perfect and receives an anteriority interpretation. In other words: it is only the obligation to read a speech that is affected by the perfect, not the event of reading itself (which may or may not have taken place).

As argued in [32], one of the major differences between the English and Spanish auxiliary systems lies in the fact that Spanish modals are morphologically unremarkable as verbs, in the sense that they have full inflectional paradigms (of both finite and non-finite forms). As such, Spanish modals can appear as complements to any other auxiliary since they have a full set of non-finite forms. Thus, since the modal deber has a participle, it may be modified by perfective haber: semantically, this combination is interpreted as a possibility (under the epistemic reading of deber) holding in the past.
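The c-command reasoning above can be made concrete with a small sketch. It assumes a deliberately simplified encoding that is not part of the original analysis: the clausal spine is a right-branching binary tree built from nested tuples, words stand in for heads, and the helper names `spine`, `leaves` and `c_commands` are hypothetical. C-command is approximated as "the sister of X contains Y".

```python
def spine(words):
    """Build a right-branching binary tree: (head, complement)."""
    tree = words[-1]
    for word in reversed(words[:-1]):
        tree = (word, tree)
    return tree

def leaves(tree):
    """Return the terminal nodes of a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def c_commands(tree, x, y):
    """True if leaf x c-commands leaf y, i.e. x's sister contains y."""
    if isinstance(tree, str):
        return False
    left, right = tree
    if left == x and y in leaves(right):
        return True
    if right == x and y in leaves(left):
        return True
    return c_commands(left, x, y) or c_commands(right, x, y)

# Monotonic spine for "ha debido leer" (arguments omitted for brevity).
clause = spine(["ha", "debido", "leer"])
print(c_commands(clause, "ha", "debido"))
print(c_commands(clause, "ha", "leer"))
print(c_commands(clause, "leer", "ha"))
```

In any spine built this way, the highest auxiliary c-commands every lower head, so the unwanted scope of haber over leer cannot be avoided without changing the shape of the structure.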
Crucially, a structural description such as Figure 10 (maybe with fewer functional projections, but configurationally identical) is unavoidable under currently standard minimalist assumptions: every tree is at every point binary-branching because the structure-building mechanism (generalised transformations, as in [15,16]; Merge, as in [39]; MERGE as in [40]) can only manipulate two objects at a time. Furthermore, because of the assumption of structural uniformity that is at the core of minimalist theorising, there is no room for a different kind of structural description. If we require a theory of the grammar to be descriptively adequate, then there are reasons to reject Figure 10 as an adequate structural description for (2) on semantic bases (to the extent that these depend on syntactic configuration). Here is where a TAG treatment becomes especially attractive: a TAG allows us to define more than a single elementary tree to compose a structural description. Therefore, if we can define a syntactic object that includes (properly contains) ha debido and excludes leer un discurso, we can create a configuration where the problematic scope relation identified above does not hold.3

We can do exactly this if we assume that Spanish modals, unlike their English counterparts, are lexical predicates, and as such, may head their own elementary trees. Therefore, where English needs a single elementary tree, Spanish may require more. Summarising a lot of previous work, this is a tenable hypothesis if we consider that (i) Spanish modals, unlike English modals, are morphologically unremarkable verbs, thus (ii) they can be selected by other auxiliaries (when they are, they "absorb" the information conveyed by auxiliaries to their left [24,32]), and (iii) it has been argued that they can be thematic assigners, just like lexical verbs. We know they are auxiliaries because they can combine with meteorological verbs and because of their semantic contribution to the compositional interpretation of the structures in which they appear (epistemic/deontic/dynamic in the most widely accepted classifications). Thus, we can define the (slightly simplified) elementary trees in Figure 11:


Figure 11. Elementary trees for ‘El senador ha debido leer un discurso’.

Under the assumption that Spanish modal auxiliaries are lexical predicates, it becomes possible for us to define an auxiliary tree that contains the modal and its functional modifier, the perfective auxiliary haber, and which excludes, naturally, the lexical verb. We can provide further evidence that the perfect only affects the deontic modal and not the lexical verb: there is no actuality entailment in (2). Therefore, (4) is not contradictory:

(4) El senador ha debido leer un discurso, pero no lo ha hecho.
    The senator have.3SG-AUX-PERF must-PART read-INF a speech, but NEG CL-ACC have.3SG-AUX-PERF do-PART.
    "The senator has had to read a speech, but he hasn't done so"

The interpretation of (2) is one in which only the obligation conveyed by the deontic modal receives an anteriority interpretation due to the perfective auxiliary, but not the event denoted by the lexical verb. The same situation arises with temporal auxiliaries; consider the case in (5), which combines a future auxiliary with the dynamic modal poder:

(5) El senador va a poder leer un discurso.
    The senator goes-to.3SG-AUX-FUT be.able.to-INF read-INF a speech.
    "The senator will be able to read a speech"

In (5), it is clear that what is temporally anchored in the future is not the event of reading, which may or may not take place at all, but the senator's capacity to engage in such an event of reading. To capture this reading, we propose, it is necessary to define a syntactic object that includes the temporal auxiliary and the modal but excludes the lexical verb, a configuration that is unavailable given the structure in Figure 10.4

The non-monotonicity of TAG derivations, combined with a lexicalised approach to the substantive definition of elementary trees, allows us to provide an adequate structural description for the relations between auxiliaries in Spanish chains.

3.2. German Modals: Different Elementary Trees?

When it comes to verb clusters in Continental West Germanic, variation seems to play a major role. The OV nature of these languages causes both finite and non-finite verb forms to show up in a clause-final stack with different (morpho-syntactic) properties and different reordering rules. Essentially, the analyses proposed in the early 1980s boil down to the concept of verb raising (VR).
The standard analysis of this phenomenon is put forward in [41]: assuming a biclausal analysis of verb clusters, the embedded verb is taken to be extracted from the complement clause and Chomsky-adjoined to the matrix verb (to its left in German and to its right in Dutch):

Verb Raising:
   ... V1]S V2
a. ... e1] [V1 V2]V (German)
b. ... e1] [V2 V1]V (Dutch)

Along the lines of Evers, a derivation in Dutch might look like (6):

(6) a. ... dat Jan [PRO [een huis kopen]VP]S wil (D-Structure)
       that Jan a house buy wants.3SG
    b. ... dat Jan [PRO [een huis e]VP]S wil kopen (VR)
       that Jan a house want.3SG buy
    "that Jan wants to buy a house"

The papers [42,43] represented a major breakthrough in the treatment of VR within the GB approach. In particular, the latter assumed that this phenomenon consisted of two separate processes: (i) syntactic reanalysis of the verb sequence resulting in clause union and (ii) inversion of the constituents in the sequence. Moreover, the so-called infinitivus pro participio (IPP) construction represented (and in fact still represents) a thorn in the side of scholars who specialise in West Germanic: these constructions feature a V1-V3-V2 (Aux-V-Mod) sequence in embedded clauses, where the modal verb resists participial morphology (for instance, gemusst shows up as müssen) and is not adjacent to the auxiliary that selects it (e.g., hätte schreiben müssen, "would.have write must.INF"). Over the years, different approaches have attempted to tackle the anomalous sequence V1-V3-V2 that does not comply with the expected canonical V3-V2-V1 order of German (we refer the reader directly to [44]).

The paper [45] concentrated on the "verb raising" nature of West Germanic verb complexes and, following the lead of [43], emphasised two specific aspects of the complexes: (i) the type of complement selected by modals and auxiliaries and (ii) the rule of extraposition (via Chomsky-adjunction) that some types of infinitives were assumed to undergo.
We will see that (i) plays a major role, the complements being either S-nodes or VPs, while (ii) was assumed to account for the great deal of variation in West Germanic verbal complexes: in fact, if we stick to the hypothesis that modals, causatives and perception verbs pattern in the same way, coherently subcategorising for VPs and optionally undergoing raising to a higher V, then the specific locus of adjunction and the possibility of permutations (V3-V2-V1/V1-V3-V2 etc.) become the main ingredient of cross-linguistic variation. Take, for instance, (7a,b), which feature a perception-causative-V cluster:

(7) a. ... dass Hans Peter Marie schwimmen lassen sah (Germ.)
       that Hans Peter Marie swim let saw.3SG-PST
       "that Hans saw Peter let Marie swim"
    b. ... dat Jan Piet Marie zag laten zwemmen (Dut.)
       that Jan Piet Marie saw.3SG-PST let swim
       "that Jan saw Piet let Marie swim"

As we have seen in (6) and (7), the order of Vs in the Dutch complex (which is the mirror image of its German counterpart) is derived by assuming infinitive extraposition. On the contrary, German displays a coherent OV final ordering in this case: extraposition was assumed, e.g., for modal perfective constructions where the finite V precedes the 2-infinitive complex. Incidentally, note that the Dutch example features crossing dependencies between two categories (in abstract terms, using integers to indicate dependencies between nouns and verbs, N1 N2 N3 V1 V2 V3), the hallmark of mild context-sensitivity. In contrast, the German example expresses the same proposition by means of center embedding (N1 N2 N3 V3 V2 V1), which is strictly context-free. The reader can see that the English translation only makes use of tail recursion (N1 V1 N2 V2 N3 V3), a formal mechanism available, in principle, in regular grammars. Given their explicitness and flexibility in the definition of elementary trees, LTAGs are particularly well-suited to represent this cross-linguistic computational variation.
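The three abstract orders just mentioned can be told apart by inspecting how the noun-verb dependency arcs relate to one another. The following is a minimal sketch under stated assumptions: tokens are written `N1`, `V1`, etc., each `Ni` is linked to `Vi`, and the function names `arcs` and `classify` as well as the three labels are hypothetical. The sketch only classifies a given string; it says nothing about the grammars that generate such strings.

```python
def arcs(order):
    """Return (noun position, verb position) for each dependency index."""
    position = {token: i for i, token in enumerate(order)}
    indices = sorted({token[1:] for token in order})
    return [(position["N" + k], position["V" + k]) for k in indices]

def classify(order):
    """Label a string by the geometry of its N-V dependency arcs."""
    links = arcs(order)
    # Two arcs cross when one starts inside the other and ends outside it.
    crossing = any(a < c < b < d for (a, b) in links for (c, d) in links)
    # Two arcs nest when one lies strictly inside the other.
    nested = any(a < c and d < b for (a, b) in links for (c, d) in links)
    if crossing:
        return "crossing"
    if nested:
        return "nested"
    return "sequential"

dutch = "N1 N2 N3 V1 V2 V3".split()
german = "N1 N2 N3 V3 V2 V1".split()
english = "N1 V1 N2 V2 N3 V3".split()

for name, order in [("Dutch", dutch), ("German", german), ("English", english)]:
    print(name, classify(order))
```

The labels correspond to the crossing dependencies of the Dutch example, the center embedding of the German example and the tail recursion of the English translation, respectively.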
Traditional assumptions on extraposed infinitives/verb raising can also be expressed in terms of a lexicalised TAG (i.e., the perspective they adopt in their paper) resorting specifically to Trees with links, "[...] where link refers to the relationship between an and the Chomsky-adjoined antecedent that binds it" ([45], p. 310), see Figure 12:


Figure 12. Elementary tree with VR

Thus, the differences between German and Dutch might be captured with structurally identical trees, where different adjunction sites (structurally adjacent S-nodes) are assumed: this approach was defined as a string-vacuous raising hypothesis and apparently accounts for cross-linguistic variation in an elegant and straightforward way as it "reduce[s] variation in the word order of verb sequences to the choice of node at which auxiliary trees are adjoined [...]" ([45], p. 317). However, Kroch and Santorini point out that such an assumption might imply serious structural issues: for instance, a PRO subject of a modal might even end up c-commanding its antecedent (which would make it impossible to establish an appropriate binding relation). Therefore, they put forward a different analysis that discards string-vacuous verb raising and assumes that modals (and "functional" verbs) uniformly subcategorise for VP complements. Their proposal will be spelt out with a Franconian example (8a) and a standard German example (8b):

(8) a. ... dass er singen hat müssen
       that he sing has have-to-INF
       "that he had to sing"
    b. ... dass sie hätte schreiben können
       that she have-SUBJ write be-able-to
       "that she could have written"

First of all, the modal müssen and the auxiliary hat are taken to head their own auxiliary trees (note the absence of extrapositions), whereas the lexical verb is represented in the initial tree with root S, as in Figure 13:


Figure 13. Elementary trees for the examples in (8).

We can now obtain (8a) by adjoining the first auxiliary tree in the VP node of the initial tree and the second auxiliary tree in the VP * node, which has become the lower VP node after the first adjoining. This is represented in Figure 14:
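The two adjunction steps just described can be simulated with a toy implementation of TAG adjunction. This is a sketch under explicit assumptions, not a rendering of Figure 13: the class `Node`, the helpers `adjoin`, `find_foot`, `yield_of` and `aux_tree`, and the head-final shape [VP VP* V] chosen for both auxiliary trees are all hypothetical, picked only so that the terminal string matches (8a).

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None  # set on lexical leaves only
    foot: bool = False          # marks the foot node of an auxiliary tree

def yield_of(node):
    """Terminal string of a tree, left to right."""
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in yield_of(child)]

def find_foot(node):
    """Return the foot node of an auxiliary tree, or None."""
    if node.foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None

def adjoin(target, aux):
    """Adjoin `aux` at `target` in place; return the new lower node.

    The material under `target` is moved under the foot of `aux`,
    and `target` takes over the root material of `aux`.
    """
    foot = find_foot(aux)
    assert foot is not None and foot.label == aux.label == target.label
    foot.children, foot.word, foot.foot = target.children, target.word, False
    target.children, target.word = aux.children, None
    return foot

def aux_tree(word):
    """Assumed head-final auxiliary tree: [VP VP* V]."""
    return Node("VP", [Node("VP", foot=True), Node("V", word=word)])

# Assumed initial tree: [S [NP er] [VP [V singen]]]
vp = Node("VP", [Node("V", word="singen")])
initial = Node("S", [Node("NP", word="er"), vp])

lower_vp = adjoin(vp, aux_tree("müssen"))  # first adjunction, at VP
adjoin(lower_vp, aux_tree("hat"))          # second adjunction, at the lower VP
print(" ".join(yield_of(initial)))
```

The point of the sketch is that the second adjunction targets a node that only becomes "the lower VP" as a result of the first one, which is the step the prose above describes.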


Figure 14. Derived tree after adjunction.

A certain degree of variation still remains in Kroch and Santorini's [45] model as the standard German verb complex V1-V3-V2 (i.e., the IPP construction) requires Chomsky-adjunction of the infinitive to VP in the elementary tree. We can provide the elementary trees as in Figure 15:

Figure 15. Elementary trees featuring V extraposition.

The two auxiliary trees are assumed to adjoin at the VP * and VP+ sites in the initial tree, where the symbols * and + are originally used in [45], p. 322, as diacritics for nodes that are targeted by adjunction. The resulting tree composition is (Figure 16):


Figure 16. Derived tree with V extraposition.

Kroch and Santorini’s conclusion is that variation arises due to two different factors: (a) the choice between elementary trees with or without extraposed infinitives and (b) the choice of different nodes as sites for adjunction (the specific “height” of the target nodes within the initial tree). At any rate, discarding the string-vacuous analysis is a better solution from a theoretical perspective: moreover, the learners would not be forced to postulate the existence of trees with links ([45], p. 323). (We express our gratitude to a reviewer for pointing this out to us).

One might observe that our proposal does not rigidly constrain the possible permutations that can be obtained by adjoining at different adjunction sites: however, the degree of variation is not unconstrained. Suppose we maintain the derivational order “V extraposition > adjunction” assumed by Kroch and Santorini. In this case, the only other possibility we are left with, apart from (8), is adjoining the hätte-auxiliary tree first and afterwards the können-auxiliary tree in the same adjunction sites. What we get is the string *können schreiben hätte (V2-V3-V1), which is excluded in standard German and extremely rare but still possible in some northern varieties (especially Dutch). Conversely, suppose we allow ourselves the opposite derivational order, i.e., “adjunction > V extraposition”: knowing that adjunction bleeds extraposition (since once we have adjoined the auxiliary trees, there is no possible target for V extraposition), we are now left with a new initial tree (Figure 17) that can be the target of adjunction for the same auxiliary trees of (Figure 15):

Figure 17. Elementary tree without V extraposition.


Adjoining the two auxiliary trees to the same initial tree in different orders, we obtain the strings schreiben hätte können (V3-V1-V2) and schreiben können hätte (V3-V2-V1): the former is rare but possible in some southern German and Austrian dialects, the latter is the strict head-final modification, found in several German, Austrian and Dutch non-standard varieties (in this case, the surface construction can actually be infinitive-participle-auxiliary). Thus, the derivational order and the possibility of extraposition give rise to a limited variety of combinations that makes sense of the variation found among the dialects: recall that (i) we are dealing with an initial tree and two auxiliary trees, and (ii) that adjunction always inserts auxiliary trees into initial trees (at locations determined by the identity of root-frontier labels), which limits the number of superficial strings that can be obtained. In our discussion of the TAG approach to German, the reader may have noted that hätte (‘have’) was assigned its own elementary tree but in the Spanish discussion, perfective auxiliary haber (‘have’) was part of the elementary tree anchored by a modal: the question arises, then, whether have is a lexical predicate in German. Recall that a TAG is lexicalised if and only if each elementary tree is the projection of a single lexical head ([28]). As a matter of fact, this may be a point of cross-linguistic variation: auxiliaries may be lexicalised to different extents in different languages, and this impacts directly on the size of elementary trees for sequences of auxiliary verbs and verb complexes. 
The argument in this section illustrates the first point mentioned above: lexicalised TAGs allow us to define a picture of cross-linguistic variation that differs from the traditional minimalist view (although it may be construed as complementary to it). Note that if English modals are not lexical heads, elementary trees for sequences of auxiliaries in English will be, in general, bigger than elementary trees for sequences of auxiliaries in Spanish. At most, a chain of auxiliaries in English houses four auxiliaries:

“John must have been being helped” (Modal + perfective + progressive + passive) (9)

In a representation a la Frank, there would be two elementary trees for (9): one anchored by John, and the other by helped. The limitations of English chains of auxiliaries are related to (i) the fact that modal verbs are defective, without non-finite forms, and (ii) predicates that in other languages behave as auxiliaries (such as start, finish, continue, attempt) in English fail the most fundamental tests for auxiliarihood (ability to host negation, inversion in interrogative sentences, etc. [35]). Note that the structural description assigned to English auxiliary chains by an LTAG is strictly monotonic, and in principle, compatible with cartographic assumptions, at least configurationally. However, TAGs allow us to capture aspects of cross-linguistic variation by not assuming a priori that all sequences of auxiliaries (or indeed all elementary trees) in all languages are created equal. This approach to variation may be thought of as complementary to the mainstream generative one, whereby functional categories are identified as the locus of cross-linguistic differences: in an LTAG, there is variation to be found in the definition of elementary trees depending on whether a particular expression behaves as a lexical or a functional head in a given language. Thus, there are reasons to classify Spanish modal auxiliaries as “lexical” in the relevant sense (they can modify and be modified), but not English modals. Frank [21], p. 238, formulates the issue in the following terms: “The set of elementary trees in the grammar of a language is the only locus of cross-linguistic differences, imposing substantial restrictions on the kind of variation that we will find crosslinguistically”. Our earlier observation about the relative size of elementary trees and the length of chains can be somewhat generalised. 
If a language’s auxiliary verbs (in particular, auxiliaries that take infinitival complements) are lexical predicates, then sequences of auxiliaries may be longer, but their component elementary trees will be smaller. This observation is related to the Menzerath–Altmann law (MAL; [46]), although we must note that the MAL makes no reference to hierarchical relations; rather, it is based on quantitative properties of textual units. However, the inverse relationship between the size of composite units and the size of elementary units (such that the bigger the composite unit, the smaller its component parts) seems to emerge from the comparison of English and Spanish auxiliary chains.

4. Locality and the Definition of Dependencies

The previous section dealt with the first of the issues we identified: the definition of elementary trees and a revision of the CETM. The second issue, how to express non-local dependencies in terms of local relations, is perhaps where the novel aspects of our contribution become more explicit. We need to define what the nodes in the elementary trees stand for. In other words: structural descriptions specify relationships between objects; once we have defined the allowed configurations in which those objects may occur (elementary and derived trees), we can look closer at the atomic objects themselves.

Let us briefly consider some difficulties that arise with the traditional minimalist view that syntactic terminals are sets of features (as in [39,47,48]; see [49], p. 266, for a version of this critical argument). Suppose that we have a syntactic object α, which is defined by a set of features, some of which are interpretable and some of which are uninterpretable (whether this uninterpretability is related to valuation or not is orthogonal to the present discussion). We thus define α as follows (where i = interpretable, u = uninterpretable):

α = {iF1, iF2, uF3, uF4} (10)

Now suppose that a suitable head H that can valuate/check the uninterpretable features of α has been introduced in the derivation. Minimalism [7,12] takes this to be the kind of situation that motivates the property of displacement, and therefore α must be copied and internally merged in a position where H and the copy of α are in a suitable relationship for the elimination of uninterpretable features (indicated with strikethrough in (11) below). Otherwise, the syntactic object containing α will collapse at the interfaces. The problem is the following: how are α and its copy related? If the internal merge of α’s copy has succeeded in creating a suitable checking configuration, then the featural definition of this copy will be:

$\alpha_{\mathrm{Copy}} = \{iF_1, iF_2, \cancel{uF_3}, \cancel{uF_4}\}$ (11)

However, clearly, α and its copy are not defined by identical feature sets. This means that if operations are indeed driven by the necessity to valuate/check features [6], no two copies of a syntactic object are defined by the same feature specification. One may say that if we have a chain, then having any featural change in one link “automatically” affects all other links. This has several problems. One is that operations in standard formal systems (other than cellular automata) cannot take place simultaneously. Therefore, one link of the chain will necessarily be affected first. Then, the problem is how to determine what else is a link in that chain, given that one of the links now looks different. It can be circumvented with indices, but the problem remains. Another is that for this to work, we would need all links to be accessible: this goes against locality theory, in particular, the idea that some syntactic domains (“phases”), when completed, get transferred to the performance systems. If there is a copy in a transferred domain, it cannot be tampered with.
Concretely, in what did John buy, the copy of what in the complement position of buy is not accessible by the time we check the wh-feature in Spec-CP, since the VP will have been transferred. Identity is not possible in a system that admits feature checking relations since the feature matrix of an element varies as the derivation unfolds and copies are internally merged in places in which they can valuate and erase uninterpretable features. Any link between α and its copy should be encoded as a diacritic of sorts (e.g., a referential index that is carried throughout the derivation). An alternative could tamper with the “foot” of a chain, which may or may not be accessible. A similar argument about the problems of assuming copy-creation mechanisms in a feature-based computational system is presented in [50], whereby taking the chain system in [39] seriously leads to a multiplication of entities in chains and the unwanted consequence of every chain having at least one member with unvalued/unchecked features: how that chain can receive an interpretation is a problem still unsolved, particularly given the strong set-theoretic commitments of more recent minimalist proposals. Some developments (e.g., [40,51]) work around a distinction between copies and repetitions, whereby a copy (produced by Internal Merge) would occupy two positions at the same time: as convincingly argued in [52], this is incompatible with the set-theoretic foundations of structure building. That requiring identity in movement chains under the copy theory of movement is not tenable under standard definitions of identity (for reasons related to set-membership as well as featural composition) was argued in [50].

There are several alternatives to this view, most of which entail abandoning the so-called single mother condition [53]: the requirement that every node in a tree have a single mother node (i.e., be dominated by only one node, distinct from itself). These include parallel merge [54] and several versions of multidominance ([50,55–57]). In classical TAG ([18,58]), filler-gap dependencies are captured by means of so-called links. Links in TAGs are relations between two nodes A and B, such that A c-commands B in an elementary tree and B dominates a null string. Links are crucially maintained after adjunction: they cannot be disrupted or created by means of tree-composition operations. In Joshi’s introduction of links [18], p. 214, no mention is made of issues of reference, but since links do not affect the weak generative capacity of the grammar, the same results can be obtained with a TAG where all links have been removed [18], p. 217.
This device is not exclusive to (or original to) TAGs: Peters and Ritchie [59] define the so-called phrase linking grammars, where links are used to relate displaced constituents to their “base generation” position in directed acyclic graphs. In classical TAGs, links relate operators to variables and remain unaffected by the adjunction of recursive structure: as such, filler-gap dependencies are always defined in elementary trees.

The main difference between a TAG and a minimalist approach to filler-gap dependencies is that there is no inter-clausal movement in TAGs (lexicalised or not). We may compare the derivation of a wh-interrogative such as what do you think the senator should read? in TAG terms and minimalist terms to illustrate this point. First, TAGs. In a lexicalised system à la Frank, we have three elementary trees:

i. The extended projection of the lexical verb read.
ii. The extended projection of the lexical verb think.
iii. The extended projection of the noun senator.

Let us diagram the derivation in Figure 18, which involves both substitution and adjunction:

Figure 18. Derivation with Substitution and Adjunction.
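The two composition operations in this derivation can be illustrated with a small sketch. It is a toy model under simplifying assumptions that are ours and not part of the analysis: the node labels, the choice of C' as root and foot of the think tree, and the inclusion of do and you inside that tree are illustrative, and the names `Node`, `substitute`, `adjoin` and `words` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None  # terminal string ("" stands for the empty string)
    foot: bool = False          # foot node of an auxiliary tree
    subst: bool = False         # open substitution site

def find(node, pred):
    """Return the first node (preorder) that satisfies pred, or None."""
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None

def substitute(tree, initial):
    """Fill the first open substitution site labelled like the root of `initial`."""
    site = find(tree, lambda n: n.subst and n.label == initial.label)
    site.children, site.word, site.subst = initial.children, initial.word, False

def adjoin(tree, aux, label):
    """Adjoin `aux` (root and foot labelled `label`) at the first such node of `tree`."""
    target = find(tree, lambda n: n.label == label and not n.subst)
    foot = find(aux, lambda n: n.foot)
    # The material dominated by the target is re-attached under the foot node...
    foot.children, foot.word, foot.foot = target.children, target.word, False
    # ...and the auxiliary tree takes the target's place.
    target.children, target.word = aux.children, aux.word

def words(node):
    """Terminal yield of a tree, skipping empty strings and unfilled sites."""
    if not node.children:
        return [node.word] if node.word else []
    return [w for child in node.children for w in words(child)]

def leaf(label, word):
    return Node(label, word=word)

# Initial tree anchored by 'read': operator and gap are both present from the start.
read_tree = Node("CP", [
    leaf("DP", "what"),
    Node("C'", [Node("TP", [
        Node("DP", subst=True),
        Node("T'", [leaf("T", "should"),
                    Node("VP", [leaf("V", "read"), leaf("DP", "")])]),
    ])]),
])
# Auxiliary tree anchored by 'think': root C', foot C'.
think_tree = Node("C'", [
    leaf("C", "do"),
    Node("TP", [leaf("DP", "you"),
                Node("T'", [Node("VP", [leaf("V", "think"),
                                        Node("C'", foot=True)])])]),
])
# Initial tree anchored by 'senator'.
senator_tree = Node("DP", [leaf("D", "the"), leaf("N", "senator")])

substitute(read_tree, senator_tree)
adjoin(read_tree, think_tree, "C'")
print(" ".join(words(read_tree)))  # what do you think the senator should read
```

Note that neither operation touches the wh-operator or the empty object of read: both are nodes of the read tree before and after composition, which is the sense in which the dependency stays local to one elementary tree.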


The final derived tree would look like the one in Figure 19:

Figure 19. Derived tree for ‘what do you think the senator should read?’.

It is crucial to note that in Figure 19 there is no actual movement: the relationship between the empty string in the complement position of V read and the wh-operator in Spec-CP was defined at the level of the elementary tree of read, with both the operator and the variable being, as it were, “base-generated”. The adjunction of the think elementary tree between the wh-operator and the rest of the read elementary tree just simulates the effect of movement: as we will highlight below, the relation between read and what remains undisturbed. Note also that there is no need to propose “intermediate landing sites”, which is consistent with the characterisation of wh-movement as an unbounded rule (U-rule) in [60], whose application takes place in one fell swoop. 5 In minimalist terms, the structure would be built from the bottom-up, adding one node at a time, and the relation between the operator and the variable would be established only after movement.

5 Importantly, this does not mean that TAGs are incompatible with successive cyclic movement. If there are occurrences of a wh in intermediate positions, then we need to assume that there is a dependency to be defined within an adjoined elementary tree. English does not seem to provide evidence of intermediate landing sites in cases of unbounded wh-movement, but other languages do (e.g., by spelling-out copies or dummy wh-pronouns in intermediate locations). What this system allows us to dispense with are tokens of a syntactic object required only by formal reasons, without there being any semantic or phonological effect (this may, of course, vary across languages as it is part of the definition of elementary trees). Frank [21], p. 237, allows objects to move to the edge of an elementary tree (his system allows for singulary transformations) and then, from

This gives us a segue to the comparison with a transformational approach to displacement. The minimalist approach is based on the idea that constituent movement is motivated by the need to eliminate uninterpretable features from syntactic representations. The property of displacement is, according to [6,12], an “imperfection” of the faculty of language, together with the existence of uninterpretable features. These two “imperfections” cancel each other out, so to speak, if displacement allows the computational system to eliminate uninterpretable features from phrase markers. Crucially for our purposes, however, the implementation of displacement in transformational generative grammar requires a multiplication of entities: in the copy theory of movement, the displaced constituent gets copied and re-introduced (re-merged) into the structure. Furthermore, the theory of locality in minimalism, based on designated nodes in the structure (phase heads), requires additional copies of displaced constituents in an “escape hatch” position (an additional specifier) in each phase (a very similar situation to what happened in the Barriers model, as noted in [61]). Assuming that C and v* are phase heads, a long-distance wh-interrogative requires (for intra-theoretical reasons) a copy of the displaced object in each Spec-C and Spec-v* position in between the position where the wh-element was first merged and the final Spec-CP position, where it can check its [uWh] feature as in Figure 20:

Figure 20. Minimalist derivation for ‘What do you think that the senator should read?’.

One of the fundamental questions when analysing filler-gap dependencies is whether movement is necessary at all. Non-transformational generative theories of displacement (e.g., LFG, HPSG) have been very successful since the late 1970s, and some versions of TAGs [18,28,58] are indeed non-transformational in the sense that they do not include rules for singulary transformations (such as wh-movement), but only generalised transformations (specifically, substitution and adjunction) and (in some cases) the concept of links. This is not to say that TAGs are inherently incompatible with singulary transformations; as a matter of fact, the version of TAG explored in [19,21] is closer to transformational generative models than to non-transformational models (read: “without singulary transformations”). The TAG formalism specifies how elementary trees are put together (and, if each elementary tree receives a local semantic interpretation, then it also defines a theory of compositionality), but it remains agnostic about the operations that apply within elementary trees; these are subject to independent empirical inquiry.

As part of our approach to the mechanisms by means of which non-local dependencies are reduced to local ones, we will briefly explore an alternative that dispenses with a rule of movement in the derivation of wh-dependencies (see also [32] for an application of this model to gerund fronting and the position of subjects in Spanish). Our proposal makes use of a formal tool that is not new to TAG theorising: we define nodes in elementary trees as addresses to locations in the lexicon. In other words, elementary trees do not contain lexical tokens but addresses to lexical types. This is based on an idea by [62], which they implement in the context of a discussion about coordination in TAGs: Each node in the tree has a unique address obtained by applying a Gorn tree addressing scheme. This device is implemented in an account of gapping and right node raising. We may note that it is not unlike multidominance in terms of the tree language it defines, but it actually specifies how it is implemented computationally. We are extending this addressing mechanism to cover not only coordination schemata but also relations within elementary trees. Each time a terminal is introduced in an elementary tree, what is actually manipulated is an address for that terminal.
This is important since an address can be called for several times in the execution of a program, in our particular case, structure building and compositional interpretation. All we need the grammar to do is recognise nodes that are assigned identical addresses. This is not requiring the computational system to do anything other than what it already does: in order for substitution or adjunction to work, identically labelled nodes in distinct elementary trees need to be identified as one. What we suggest is that this process can affect not only root-frontier nodes but also lexical nodes inside elementary trees. This entails that we can simplify the structure in Figure 19 since it contains two calls for the same address what. This simplification is one of the main points of the present paper. There are two ways to model this: one, topological; the other, graph-theoretical. In both cases, the formal object being defined is the same, and crucially, the structural relations between elements in the structural description are the same. The topological view entails defining elementary trees in a workspace understood as a topological space; this space can fold and self-intersect (this proposal has parallels in the field of DNA topology). The points of self-intersection are the objects that appear in more than one structural context: the syntactic workspace folds to identify different tokens of the same type, yielding a higher-dimensional manifold (a 2-D space folds creating a 3-D manifold), where in a wh-interrogative like what do you think the senator should read? what in Spec-C and what in Compl-V have a zero distance. This is only possible in a metric topological space if what in Spec-C and what in Compl-V are identical (as per the so-called identity property of metric spaces, whereby d(x, y) = 0 iff x = y; see [63,64]) as desired (this approach to movement is developed in [65]). 
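The idea that two positions in a structural description may call the same address can be sketched as follows. The sketch makes several assumptions of its own: trees are nested tuples, the Gorn convention used takes the root to be the empty sequence and numbers daughters from 1, phrasal nodes are named by their Gorn address, and the lexical address of a terminal is simply the word itself (which would wrongly collapse distinct tokens of, say, the in a longer sentence). The tree is a simplified stand-in for Figure 19, and the names `gorn`, `edges` and `mothers` are hypothetical.

```python
# A tree as nested tuples: (label, daughter, daughter, ...); leaves are strings.
tree = ("CP", "what",
        ("C'", "do",
         ("TP", "you",
          ("VP", "think",
           ("C'",
            ("TP", ("DP", "the", "senator"),
             ("T'", "should", ("VP", "read", "what"))))))))

def gorn(node, address=()):
    """Yield (Gorn address, label): the root is (), the i-th daughter of p is p + (i,)."""
    if isinstance(node, str):
        yield address, node
        return
    yield address, node[0]
    for i, child in enumerate(node[1:], start=1):
        yield from gorn(child, address + (i,))

def edges(node, address=()):
    """Yield dominance edges (mother, daughter) of the corresponding graph."""
    for i, child in enumerate(node[1:], start=1):
        if isinstance(child, str):
            yield address, child            # terminal: named by its lexical address
        else:
            yield address, address + (i,)   # phrasal node: named by its position
            yield from edges(child, address + (i,))

positions = dict(gorn(tree))
what_positions = [a for a, label in positions.items() if label == "what"]

mothers = {}
for mother, daughter in edges(tree):
    mothers.setdefault(daughter, set()).add(mother)

print(len(what_positions))     # 2 positions in the tree call the address 'what'
print(len(mothers["what"]))    # 2 mothers of a single node in the graph
```

In the tree, what occupies two distinct Gorn positions; once terminals are identified by address, the graph contains one what node immediately dominated by both CP and the VP of read.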
Here we will briefly pursue a different view: the graph-theoretic account (presented in detail in [66]). The general tenets of this approach are already familiar to generative linguists, as per multidominance theories: the core idea is that instead of constructing unordered sets (a decision that forces the multiplication of nodes in instances of displacement and more generally yields problems of distinguishing copies and repetitions; see [40,51]), the grammar generates local graphs. However, the justification for structural descriptions that do not follow the single mother condition [53] in the present work is not related to properties of merge (as it is in Citko’s or Johnson’s work) but follows from the assumption that what the syntax operates over are nodes, which are assigned addresses, and that these addresses are unique identifiers in a derivation. 6

Furthermore, and along the lines of [67], only substantive symbols are assigned addresses: in terms of a lexicalised grammar, we implement this as establishing that only (single-word or multi-word) expressions that are assigned to an indexed category of the grammar and have a semantic value (categorematic elements in the logical tradition, see [68,69] for an application of this concept to categorial grammars) are assigned addresses. The set of syncategorematic symbols depends on the language: for example, English infinitival to and copulative be have been proposed to be syncategorematic [69], as have Spanish apparent prepositions and complementisers in verbal periphrases (e.g., que in <tener que + infinitive>, or a in comparable periphrases) as argued in [70].

This means that “… signs that signify nothing by themselves but serve to indicate how independently meaningful terms are combined” [68], i.e., syncategorematic expressions, are not assigned addresses, only categorematic expressions are (a similar perspective is adopted in [67], p. 214, within a purely mathematical framework: in that work, only “object characters” in formulae are assigned addresses).

What is the content of these addresses? A possibility, explored in depth in [66], is that the content of an address is the intensional logic translation of the corresponding expression ([71,72]). An introduction to intensional logic (IL) (specifically, the approach in [73] and related works) would greatly exceed the limits of this paper and is not strictly needed for our purposes. However, we can introduce some concepts that may clarify the issue of addresses, although much research is still pending. Let ⦃what⦄ be the uniquely identifying address of the expression what. We can look at how to define the IL translation of what, that is, the content of the address ⦃what⦄: let P be a first-order monovalent (intransitive) predicate (in categorial terms, an IV). Thus, if we consider what, for any sentence containing what the content of the address ⦃what⦄ would be:

⦃what⦄ = P’̂ ∀x P{x} (12)

(12) follows the proposal in [71], p. 19, and related works closely, in turn building on [73]. Karttunen proposes that the IL translation of bare wh-words, say who, is the same as the IL translation of an indefinite common NP, such as someone: there is an existentially quantified term that corresponds to a member of a set of entities, and the wh-operator introduces the choice function that selects an entity from that set based on whether it satisfies the conditions imposed by the predicative structure in which that wh occurs (see also [74] for an extension of Karttunen’s system to lexically restricted wh-phrases). The denotation of (12) is the set of propositions such that, for a 1-place predicate P, P(x) holds. In other words: the denotation of who runs? is the set of propositions such that run(x) is true. A lot of details aside, what matters to us is that the meaning of a bare wh-word, such as what or who, can be assimilated to lexically restricted wh-phrases: suppose that we have the interrogative sentence which discourse did the senator read. The semantic form of this question, in Reinhart’s terms, would be: {P|(∃⟨x, f⟩) (CH(f).
& The senator( semanticx) & formP =ˆ(x of read this f(discourse)) & true(P))} (13) question, in Reinhart’s terms, would be: The interpretation of which discourse in this context is “choose an entity that belongs {P|(∃〈x, f 〉) (CH(f ) & senator(to the setx) &of Pdiscourses =ˆ(x read (suchf (discourse)) that there & true(P))}is an x, x a senator, (13) that read that discourse)”; who, in the same context, means “choose an entity (such that…)”, with the added specification The interpretation of which discoursethat the entityin this be context animate is “choose and perhaps an entity even that also belongs human to (for inanimate entities, we have the set of discourses (such that therewhat is). an Thex, x “asuch senator, that… that” readclause that depends discourse)”; on thewho structure, in the where the wh-word or phrase same context, means “choose anappears. entity (such that ... )”, with the added specification that the entity be animate and perhaps evenLet also us go human back to (for the inanimate analysis of entities, the long we-distance have what wh).-dependency in what do you think The “such that ... ” clause dependsthat on the the senator structure should where read? the Wewh want-word to orcapturephrase the appears fact that,. informally, the what in Spec- Let us go back to the analysisCP and of the the long-distance what in Complwh--dependencyV correspond intowhat the same do you element. The TAG formalism, as think that the senator should read?revisedWe want in to[62 capture], provides the factus with that, a informally,way to do exactly the what this. in We want to capture the fact that Spec-CP and the what in Compl-Vthe correspondaddress ⦃what to the⦄ is samecalled element. 
for in two The contexts: TAG formalism, one in which it receives an operator inter- as revised in [62], provides us withpretation a way (introducing to do exactly a this.choice We function; want to captureas in [74 the]), and fact another in which it satisfies the that the address is calledvalency for in two of the contexts: monotransitive one in which verb itread receives. At the an same operator time, we want to capture the fact that there is only one syntactic element. Recall that, in the version of LTAG adopted in [32,33], arguments of a lexical head belong in the extended projection of that head. In this case, the TP and CP layers belong to the extended projection of read, and the nominal arguments the senator, and what also do. In the elementary tree defined by read, then, there are two contexts where the address of what is called, but not two distinct addresses. The result is that in the graph that defines the structural relations in the elementary tree of read, ⦃what⦄

Philosophies 2021, 6, x FOR PEER REVIEW 27 of 33

The interpretation of which discourse in this context is “choose an entity that belongs Philosophies 2021, 6, x FOR PEER toREVIEW the set of discourses (such that there is an x, x a senator, that read that discourse)”; who23 of, 31

in the same context, means “choose an entity (such that…)”, with the added specification that the entity be animate and perhaps even also human (for inanimate entities, we have whatcopulative). The “such be havethat… been” clause proposed depends to be on syncateg the structureorematic where [69] ,the as havewh-word Spanish or phrase apparent appears.prepositions and complementisers in verbal periphrases (e.g., que in ,us a ingo < backir a + toinfinitive the analysis>) as ofargued the long in [-70distance]. wh-dependency in what do you think that the senatorThis means should that read ?“ …signsWe want that to capturesignify nothingthe fact that,by themselves informally, but the servewhat into Specindi-cate Philosophies 2021, 6, 70 23 of 29 CP howand independentlythe what in Compl meaningful-V correspond terms are to combined the same” element. [68], i.e., syncategorematicThe TAG formalism expressions, as , revisedare notin [63] assigned, provides addresses, us with aonly way categorematic to do exactly expressionsthis. We want are to(a capture similar the perspective fact that is the adoptedaddress ⦃inwhat [67]⦄, isp. called214, within for in atwo purely contexts: mathematical one in which framework: it receives in antha operatort work, only inter- “ob- pretationject characters (introducing”interpretation in formulae a choice arefunction; (introducing assigned as addresses).in a [76]), choice and function; another as in in which [74]), andit satisfies another the in which it satisfies valency Whatof the ismonotransitive thethe content valency of verbthese of the read addresses? monotransitive. At the sameA possibility, time, verb weread explored want. At to the capture in same depth time,the in fact[66 we] t, hatis want that to capture the therethe is content only one of syntactic anfact address that element. 
there is the is onlyintensionalRecallone that,syntactic logicin the translation element.version of Recall ofLTAG the that,corresponding adopted in the in version [32,33 expres-], of LTAG adopted argumentssion ([71 of,72 a] ).lexical Anin [ 32introduction head,33], argumentsbelong to in intensional the of aextended lexical logic head projection (IL) belong (specifically, of in that the extendedhead. the approach In this projection case, in [ 73 of] that head. In the andTP and related CP layers works)this belong case,would the to greatly the TP extended and exceed CP layers projection the limits belong of to thisread the ,paper and extended the and nominal is projection not strictly arguments of neededread, and the nominal the forsenator our, purposes.and whatarguments However,also do.the In we senatorthe can elementary ,introduce and what tree alsosome defined do. concepts In the byelementary thatread, maythen, clarify there tree defined tarehe issuetwo by ofread , then, there contextsaddresses, where much theare address research two contexts of pending. what where is called, Let the ⦃what but address ⦄not be twothe of what uniquelydistinctis called, addresses. identifying but not The address two result distinct of is the addresses. The thatexpression in the graph what thatresult. 
We defines can is that look the in atstructural the how graph to define thatrelati definesons the inIL translation the elementary structural of what relations tree, thatof read in is, the ,the ⦃what elementary content⦄ tree of read, (which,of the recall, address is the⦃what address⦄: (which,let P assigned be recall,a first -to isorder thethe addressmonovalentbasic expression assigned (intransitive) towhat the) is basic calledpredicate expression twice; (in catego-thiswhat ) is called twice; mayrial be terms, represented an IV).this Thus, using may beif multidominance, we represented consider what, using with for multidominance, any the sentence terminal containing node with thewhat terminalwhat having the nodecontent two what having two motherof the nodes. address We mother⦃ whatcan diagram⦄ would nodes. thebe We: elementa can diagramry tree the anchored elementary by read tree anchoredas in Figure by 21read: as in Figure 21:


Figure 21. Multidominance structure for wh-movement.

The adjunction of the elementary tree of think, which includes the nominal argument you, does not disrupt the structural relations established at the elementary tree of read: the edge connecting nodes C and ⦃what⦄ and the edge connecting nodes V and ⦃what⦄ remain untouched, as seen in Figure 22 below (the adjoined auxiliary tree has been bolded for clarity).

In the topological interpretation, we would say that the distances between nodes remain constant after adjunction. This is important since, for any measure of distance in binary branching trees (number of nodes, number of edges, etc.), minimalism is forced to assume that the distance between filler and gap is different in What should the senator read?, What do you think the senator should read?, and What do you think that Mary said that the senator should read?; a consequence of this assumption is that intermediate copies are multiplied (since the theory needs to address the tension between locality in dependencies and increasing distances between fillers and gaps). Preserving structural relations after adjunction is one of the crucial features of TAGs, since all relations are established at the level of elementary trees.


Figure 22. Derived tree for wh-interrogative featuring multidominance.
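The preservation of local relations under adjunction can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not a reproduction of Figures 21 and 22: the phrasal labels, the edge sets READ and THINK, and the helpers adjoin, mothers and distance are hypothetical names introduced here. The single address what has two mothers (standing for its Spec-CP and Compl-V contexts), and adjunction only redirects the edge entering the adjunction site.

```python
from collections import deque

# Hypothetical elementary graph anchored by "read": edges are (mother, daughter)
# pairs, and the single address "what" has two mothers (CP and VP).
READ = {
    ("CP", "what"), ("CP", "C'"),
    ("C'", "C"), ("C'", "TP"),
    ("TP", "the_senator"), ("TP", "VP"),
    ("VP", "read"), ("VP", "what"),
}

# Hypothetical auxiliary graph anchored by "think"; "C'*" marks its foot node.
THINK = {
    ("C'/think", "C/think"), ("C'/think", "TP/think"),
    ("TP/think", "you"), ("TP/think", "VP/think"),
    ("VP/think", "think"), ("VP/think", "C'*"),
}


def adjoin(edges, aux_edges, site, aux_root, aux_foot):
    """Adjoin an auxiliary graph at `site`.

    Edges entering `site` are redirected to the auxiliary root, and the
    auxiliary foot is identified with `site`. No other edge is touched.
    """
    redirected = {(m, aux_root) if d == site else (m, d) for (m, d) in edges}
    grafted = {(m, site) if d == aux_foot else (m, d) for (m, d) in aux_edges}
    return redirected | grafted


def mothers(edges, node):
    """Return the set of nodes that immediately dominate `node`."""
    return {m for (m, d) in edges if d == node}


def distance(edges, source, target):
    """Number of edges on the shortest downward path, or None if none exists."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for (m, d) in edges:
            if m == node and d not in seen:
                seen.add(d)
                queue.append((d, dist + 1))
    return None


DERIVED = adjoin(READ, THINK, site="C'", aux_root="C'/think", aux_foot="C'*")

# The address "what" keeps both mothers, one edge away from each.
print(mothers(DERIVED, "what"))
print(distance(DERIVED, "CP", "what"), distance(DERIVED, "VP", "what"))
# The path from CP down to VP, by contrast, is longer after adjunction.
print(distance(READ, "CP", "VP"), distance(DERIVED, "CP", "VP"))
```

In this toy graph the two edges into what are identical before and after adjunction, whereas the path from CP down to VP grows. The latter is the kind of distance that a representation with two separate occurrences of what would have to measure between filler and gap.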

In traditional TAG terms [18], links are preserved under adjunction. However, whereas the definition of links in a TAG requires a multiplication of nodes such that a link is a relation between two nodes, one of which dominates an empty string, the system explored here satisfies the fundamental TAG hypothesis in a more economical way.

The main theoretical distinction between the system presented here and the minimalist approach to displacement is that our structural descriptions are graphs connecting addresses (which correspond to types), whereas minimalist trees contain lexical tokens. An interesting consequence of this approach is that, because the same address can be called several times, there is no multiplication of nodes in instances of displacement: instead of defining chains of distinct terms, we define a walk on a graph, that is, an ordered sequence of nodes and edges. This means that displacement rules will not replace or destroy existing syntactic relations but only create new ones on top of already existing relations: structural relations are preserved throughout the derivation. For example, consider the phrase marker corresponding to the string John has read which book. Here, which book receives theta-role and Case in a position internal to the VP; under standard assumptions, the direct object is a sister to the V. The question that arises here is crucial: in order to obtain which book has John read?, do we need to disrupt the relation between read and which book? Under the trace theory, we actually do: which book is copied, replaced by a trace, and re-introduced in the derivation at a later stage. In terms of the trace theory of movement [75] (pp. 44–45):

“The movement of NPi to position NPj (where A and B are the contents of these nodes) in (30) yields (31) as a derived constituent structure.

(30) . . . NPj . . . NPi . . .
            |         |
            A         B

(31) . . . NPi . . . NPi . . .
            |         |
            B         e

In this view, NPi and its contents are copied at position NPj, deleting NPj and A, and the identity element e is inserted as the contents of (in this case, the righthand) NPi, deleting B under identity.”

Under standard minimalist assumptions, whereby reordering rules are reformulated in terms of Internal Merge, the steps for a copy + re-merge theory are the following [76,77]:

“a. Copy one of two independently merged phrases.
b. Spell out the lower copy as trace.
c. Merge the trace.
d. Merge the higher copy (possibly in a separate derivation)”. ([78], p. 61)

“Independent operations of the Copy + Merge theory of movement:
a. Copy.
b. Merge.
c. Form chain.
d. Chain reduction”. ([79], p. 89)

Allegedly simpler systems have been proposed (e.g., MERGE [40,76]), but they have rarely been implemented in the description of grammatical phenomena cross-linguistically. In both the trace and copy theories of movement, new elements are introduced in the syntactic derivation, and thus new relations. In the trace version, the structural relation between read and which book is disrupted and replaced by a relation between read and a trace co-indexed with which book. In a strict graph-theoretic interpretation of filler-gap dependencies, as implemented in our version of a lexicalised TAG, no substantive elements need to be added to representations or derivations: the work usually done by reordering transformations is done by adding edges in elementary trees. Edges in graph theory have formal status, unlike edges in diagrams of set theory-based syntax: the mechanisms to capture long-distance dependencies in our lexicalised proposal use only the tools made available to us by graph theory plus the proposal that nodes in structural representations are not lexical tokens but addresses to lexical types.
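The contrast between the two ways of handling displacement can be sketched in Python. This is a toy illustration under stated assumptions: the list-of-positions encoding of (30) and (31), the functions move_with_trace and add_edge, and the edge from C to read are hypothetical simplifications introduced here, not part of any of the cited formalisms.

```python
def move_with_trace(structure, source, target):
    """Trace-theory movement over a list of (label, contents) positions.

    The moved category and its contents overwrite the target position,
    and the identity element "e" replaces the contents at the source.
    """
    moved = list(structure)
    label, contents = moved[source]
    moved[target] = (label, contents)  # NPi and B overwrite NPj and A
    moved[source] = (label, "e")       # the identity element replaces B
    return moved


def add_edge(edges, mother, address):
    """Graph-based displacement: add one edge, leave everything else intact."""
    return edges | {(mother, address)}


def nodes(edges):
    """Collect every node mentioned in a set of (mother, daughter) edges."""
    return {node for edge in edges for node in edge}


# (30) -> (31): the original contents A and the original relation to B are lost.
before = [("NPj", "A"), ("NPi", "B")]
after = move_with_trace(before, source=1, target=0)
print(after)

# Graph alternative: the relation between "read" and "which book" survives,
# and no node is added; only the edge from C to the same address is new.
base = frozenset({("C", "read"), ("read", "which book")})
derived = add_edge(base, "C", "which book")
print(base <= derived, nodes(base) == nodes(derived))
```

The first function rewrites two positions, while the second only enlarges the edge set, which is the sense in which existing relations are preserved.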

5. Conclusions

Generative grammar strives to provide accounts of two properties of natural languages: hierarchical structure and displacement. In contemporary minimalist analyses, both of these properties are handled by the same mechanism: Merge. Structure building, or “construal”, is the domain of External Merge (from the lexicon to the derivation), whereas structure mapping or “move” is the domain of Internal Merge (from the derivation to the derivation). However, there are two aspects that some, including ourselves, deem problematic. First, determining the size of local domains for the application of syntactic operations is left to designated nodes (so-called phase heads). This approach to locality is motivated by theory-internal conditions pertaining to Agree (such that phases are the domains where feature checking and valuation operations take place), which have no semantic correlate. In this context, assigning compositional semantic interpretations to complex syntactic objects becomes an issue. Second, the way in which displacement is handled (the copy theory of movement) is based on strong set-theoretic commitments, which require the multiplication of nodes in reordering operations: for every element that moves, two nodes (at least) are created ([52] discusses some formal inconsistencies of the set-theoretic approach for the modelling of chain dependencies).

We would like to put forth the idea that an LTAG perspective could provide constructive solutions to these two issues. In terms of construal, elementary trees are defined in terms of a notion whose definition is independently necessary for a theory of grammar: predication. Having syntactic domains be the extended projection of lexical predicates also allows us to address the issue of cross-linguistic variation from a novel perspective: as we have shown with evidence from English, Spanish and German, what counts as a lexical predicate (and thus what can anchor an elementary tree) is a locus of variation. The theory of variation is thus directly related to the theory of syntactic construal and the syntax-semantics interface. In terms of displacement, we have proposed that, if syntactic terms in derivations are defined as addresses (that stand for the intensional logic translation of categorematic basic expressions of the language), it is possible to define structural descriptions as directed graphs with a total order where nodes can be visited more than once in a path. The result is a set of elementary graphs (which are technically not trees, since in graph theory, trees do not allow for loops) where all dependencies are established at the local level. Long-distance dependencies are the result of adjoining auxiliary graphs after all local dependencies have been defined within an elementary graph. In this context, the role that in generative grammar is played by reordering transformations is actually reduced to construal + graph composition operations. Because relations in elementary graphs are preserved under adjunction, “transformations” are homomorphic mappings: they preserve relations within a selected structure and do not create new nodes or relations. Creating new edges and maximising connectivity within elementary graphs as opposed to creating new nodes (copies), which is made available to us after giving up strict set-theoretic commitments and allowing syntactic terms to be uniquely identifying addresses, has at least three advantages (see also [49]):

• The inclusiveness condition, in its strict interpretation, is always respected.
• If elementary trees are graphs defined in a workspace, then the workspace is never expanded ([40,76]).
• Problems pertaining to the distinction between copies and repetitions do not arise: copies are formalised as multiple visits to the same node in a strictly ordered walk through a graph, whereas repetitions are instances of distinct nodes (with the same morpho-phonological exponent) in a walk through a graph.

The approach proposed here satisfies the conditions for a minimalist theory of grammar: eliminate superfluous elements in representations as well as superfluous steps in derivations.
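The distinction between copies and repetitions drawn in the last bullet point can be stated as a short Python sketch. The Node class, the functions copies and repetitions, and the two example walks (including the John saw John case) are hypothetical illustrations introduced here; the order of nodes in each walk is assumed only for the sake of the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    address: str   # uniquely identifying address
    exponent: str  # morpho-phonological exponent


def copies(walk):
    """Addresses visited more than once in the walk."""
    counts = Counter(node.address for node in walk)
    return {address for address, n in counts.items() if n > 1}


def repetitions(walk):
    """Exponents shared by two or more distinct addresses in the walk."""
    addresses_by_exponent = {}
    for node in walk:
        addresses_by_exponent.setdefault(node.exponent, set()).add(node.address)
    return {exp for exp, addrs in addresses_by_exponent.items() if len(addrs) > 1}


what = Node("what", "what")
senator = Node("the_senator", "the senator")
read = Node("read", "read")
john_1 = Node("John_1", "John")
john_2 = Node("John_2", "John")
saw = Node("saw", "saw")

# A walk that visits the address of "what" twice: a copy, not a repetition.
displacement_walk = [what, senator, read, what]
# A walk through two distinct nodes with the same exponent: a repetition.
repetition_walk = [john_1, saw, john_2]

print(copies(displacement_walk), repetitions(displacement_walk))
print(copies(repetition_walk), repetitions(repetition_walk))
```

Under these assumptions no extra indexing device is needed to tell the two cases apart: identity of address settles the matter.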

Author Contributions: Conceptualisation, D.G.K. and A.P.; Formal analysis, D.G.K. and A.P.; Investigation, D.G.K. and A.P. Although the paper as a whole is the result of joint work, for the purposes of the Italian academia D.G.K. takes responsibility for Sections 1, 3.1 and 4 and the conclusions, and A.P. for Sections 2 and 3.2. All authors have read and agreed to the published version of the manuscript.

Funding: No funding received.

Acknowledgments: We would like to express our gratitude towards Peter Kosta for his invitation to participate in this Special Issue, and two anonymous Philosophies reviewers for their useful comments. The usual disclaimers apply.

Conflicts of Interest: The authors declare no conflict of interest.


AbbreviationsConflicts of Interest: The authors declare no conflict of interest. TheAbbreviations following abbreviations are used in this manuscript: NPThe =following Noun Phrase; abbreviations VP = Verb are Phrase;used in CPthis = manuscript: Complementiser Phrase; PP = Prepositional Phrase; AdvP = Adverb Phrase; DP = ; ACC = Accusative Case; DAT = Dative Case; CONDNP = Noun = Conditional; Phrase; VP INF = =Verb Infinitive; Phrase; PART CP = Participle;Complementiser PERF = Phrase; Perfective PP aspect;= Prepositional AUX = Auxiliary Phrase; verb;AdvP SG = =Adverb Singular Phrase; Number; DP CL = Determiner = Clitic; FUT Phrase; = Future ACC tense; = Accusative PST = Past Case; Tense; DAT SUBJ = = Dative Subjunctive Case; COND = Conditional; INF = Infinitive; PART = Participle; PERF = Perfective aspect; AUX = Auxiliary mood; S = Sentence verb; SG = Singular Number; CL = Clitic; FUT = Future tense; PST = Past Tense; SUBJ = Subjunctive mood; S = Sentence Notes Notes 1 As a matter of fact, [19] points out that elementary trees need not coincide with phases, as they can be larger than a phase; 1 As a matter of fact, [19] points out that elementary trees need not coincide with phases, as they can be larger than a phase; for for instance, the extended projection of a verbal head might be as large as a CP [24]. Perhaps more importantly for this paper, instance, the extended projection of a verbal head might be as large as a CP [24]. Perhaps more importantly for this paper, elementary trees can be smaller than phases. elementary trees can be smaller than phases. 2 2 ThisThis system,system, asas itit isis basedbased onon twotwo binarybinary dimensions,dimensions, predictspredicts thethe existenceexistence ofof fourfour kindskinds ofof syntacticsyntactic categories:categories: i.i. LexicalLexical predicates predicates (V, A,(V, P, A, nominalisations, P, nominalisations, lexical lexical auxiliaries). auxiliaries). ii.ii. 
LexicalLexical non-predicates non-predicates (proper (proper N, non-nominalisedN, non-nominalised common common N). N). iii.iii. Non-lexicalNon-lexical predicates predicates (functional (functional auxiliaries, auxiliaries, quantifiers). quantifiers). iv.iv. Non-lexical,Non-lexical, non-predicates non-predicates (Complementisers). (Complementisers). 33 StrictlyStrictly speaking, we also want to definedefine aa unitunit wherewhere therethere isis nono syntacticsyntactic positionposition betweenbetween haha andand debido,debido, inin thethe lightlight ofof paradigmsparadigms suchsuch asas thethe followingfollowing [[7777,,8080]:]: • ¿Qué ha el senador debido leer? What has.3SG-AUX-PRF the senator must-PART read-INF? • ¿Qué ha debido leer el senador? What has.3SG-AUX-PRF must-PART read-INF the senator? “What has the senator had to read?” 44 AA reviewerreviewer objectsobjects thatthat restrictingrestricting scopescope relationsrelations toto elementaryelementary treestrees maymay bebe problematic,problematic, forfor example,example, inin NPINPI licensinglicensing inin multi-clausalmulti-clausal structures,structures, suchsuch as:as: (i)(i) I don’t believe that they ate anything. However,However, unlike unlike our our Spanish Spanish example, example, it is it not is not derived derived by byadjunction; adjunction; it is it de isrived derived by bysubstitution: substitution: (ii)(ii) [I [I don’t don’t believe believe [S]] [S]] [S[s they they ate ate anything]. anything]. TheThe resultingresulting structuralstructural descriptiondescription thusthus allowsallows forfor scopescope relationsrelations toto bebe read off a monotonic syntactic configurationconfiguration (regardless(regardless ofof whether whether NEG NEG originates originates in inthe the embedded embedded clause clause and andNEG-raises NEG-raises or not). or For not). 
For the determination of scope relations, it is essential to consider whether the tree composition operation that has applied is substitution (which yields tail-recursive structures in English clausal complementation) or adjunction.
5 Importantly, this does not mean that TAGs are incompatible with successive cyclic movement. If there are occurrences of a wh in intermediate positions, then we need to assume that there is a dependency to be defined within an adjoined elementary tree. English does not seem to provide evidence of intermediate landing sites in cases of unbounded wh-movement, but other languages do (e.g., by spelling out copies or dummy wh-pronouns in intermediate locations). What this system allows us to dispense with are tokens of a syntactic object required only for formal reasons, without there being any semantic or phonological effect (this may, of course, vary across languages, as it is part of the definition of elementary trees). Frank [21], p. 237, allows objects to move to the edge of an elementary tree (his system allows for singulary transformations); from there, they may be further reordered by operations at the target of adjunction. Detailed discussion of aspects of successive cyclicity in LTAGs, e.g., scope reconstruction, was provided in [19].
6 This proposal also has parallels in computer languages. In programming languages such as Python, expressions are assigned addresses, which are accessed in the execution of a program by means of variables. It is even possible to access the memory address by means of a specific function, id(…). It is also possible to create local variables, which get created every time a function is called and erased when the function returns an output; this kind of operation would be analogous to the procedure to create copies in movement operations (since the copy is created and stored only temporarily, to be merged later in order to satisfy some featural requirement). In this case, we need two “spaces”: the “shell” (where the program is executed) and the list of objects in memory (which contains the address, type and value assigned to each variable).
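The Python parallel drawn in note 6 can be made concrete with a minimal sketch. All names in it (`make_copy`, `original`, `alias`, `temp`) are hypothetical and chosen only for illustration. One caveat is an assumption about the interpreter rather than the language: the value returned by `id()` is the memory address only in the CPython implementation; the language itself guarantees just that the value is unique to an object for as long as that object exists.

```python
def make_copy(value):
    """Return a temporary copy of `value` plus the local names in scope."""
    temp = list(value)              # a new object, bound to a local name
    local_names = sorted(locals())  # names visible at this point: temp, value
    return temp, local_names


original = ["what"]
alias = original                    # a second variable for the same object
copy_obj, local_names = make_copy(original)

same_address = id(alias) == id(original)     # two names, one object
distinct_copy = id(copy_obj) != id(original) # the copy is a different object
equal_content = copy_obj == original         # ...with the same value
temp_is_gone = "temp" not in globals()       # the local name did not outlive the call
```

Two variables pointing at one object correspond to two references to a single address, whereas the object built inside the function is a distinct object with the same content, bound to a name that exists only for the duration of the call.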

References
1. Bach, E. An Introduction to Transformational Grammars; Holt, Rinehart & Winston: New York, NY, USA, 1964.
2. Ross, J.R. Constraints on Variables in Syntax. Ph.D. Thesis, MIT, Cambridge, MA, USA, 1967.
3. Chomsky, N.; Halle, M. The Sound Pattern of English; Harper & Row: New York, NY, USA, 1968.
4. Chomsky, N. Barriers; MIT Press: Cambridge, MA, USA, 1986.

5. Larson, R. On the double object construction. Linguist. Inq. 1988, 19, 335–391.
6. Pollock, J.-Y. Verb Movement, Universal Grammar, and the Structure of IP. Linguist. Inq. 1989, 20, 365–424.
7. Chomsky, N. Minimalist Inquiries: The Framework. In Step by Step: Essays in Minimalist Syntax in Honor of Howard Lasnik; Martin, R., Michaels, D., Uriagereka, J., Eds.; MIT Press: Cambridge, MA, USA, 2000; pp. 89–155.
8. Cinque, G. Adverbs and Functional Heads: A Cross-Linguistic Perspective; OUP: Oxford, UK, 1999.
9. Cinque, G. ‘Restructuring’ and Functional Structure. In Restructuring and Functional Heads. The Cartography of Syntactic Structures; Cinque, G., Ed.; OUP: Oxford, UK, 2004; Volume 4, pp. 132–192.
10. Cinque, G.; Rizzi, L. The Cartography of Syntactic Structures. StIL. 2008, Volume 2. Available online: http://www.ciscl.unisi.it/doc/doc_pub/cinque-rizzi2008-The_cartography_of_Syntactic_Structures.pdf (accessed on 13 June 2020).
11. Cinque, G.; Rizzi, L. Functional Categories and Syntactic Theory. Annu. Rev. Linguist. 2016, 2, 139–163.
12. Chomsky, N. Derivation by Phase. In Ken Hale: A Life in Language; Kenstowicz, M., Ed.; MIT Press: Cambridge, MA, USA, 2001; pp. 1–52.
13. Richards, M. Deriving the Edge: What’s in a Phase? Syntax 2011, 14, 74–95. [CrossRef]
14. Boeckx, C. Phases beyond explanatory adequacy. In Phases: Developing the Framework; Gallego, A., Ed.; Mouton de Gruyter: Berlin, Germany, 2012; pp. 45–66.
15. Chomsky, N. The Logical Structure of Linguistic Theory. Mimeographed, MIT. 1955. Available online: http://alpha-leonis.lids.mit.edu/wordpress/wp-content/uploads/2014/07/chomsky_LSLT55.pdf (accessed on 30 July 2018).
16. Chomsky, N. Transformational Analysis. Ph.D. Thesis, University of Pennsylvania, Philadelphia, PA, USA, 1955.
17. Joshi, A.; Levy, L.; Takahashi, M. Tree adjunct grammars. J. Comput. Syst. Sci. 1975, 10, 136–163. [CrossRef]
18. Joshi, A.K. Tree adjoining grammars: How much context-sensitivity is required to provide reasonable structural descriptions. In Natural Language Parsing; Dowty, D., Karttunen, L., Zwicky, A., Eds.; CUP: Cambridge, UK, 1985; pp. 206–250.
19. Frank, R. Phrase Structure Composition and Syntactic Dependencies; MIT Press: Cambridge, MA, USA, 2002.
20. Stabler, E. The epicenter of linguistic behavior. In Language down the Garden Path; Sanz, M., Laka, I., Tanenhaus, M., Eds.; OUP: Oxford, UK, 2013; pp. 316–323.
21. Frank, R. Tree adjoining grammar. In The Cambridge Handbook of Generative Syntax; den Dikken, M., Ed.; CUP: Cambridge, UK, 2013; pp. 226–261.
22. Frank, R. Phase theory and Tree Adjoining Grammar. Lingua 2006, 116, 145–202. [CrossRef]
23. Grimshaw, J. Locality and Extended Projection. In Lexical Specification and Insertion; Coopmans, P., Everaert, M., Grimshaw, J., Eds.; John Benjamins: Amsterdam, The Netherlands, 2000; pp. 115–133.
24. Abney, S.P. The English Noun Phrase in Its Sentential Aspect. Ph.D. Thesis, MIT, Cambridge, MA, USA, 1987.
25. Frank, R.; Kulick, S.; Vijay-Shanker, K. Monotonic c-command: A new perspective on tree adjoining grammars. Grammars 2002, 3, 151–173. [CrossRef]
26. Gazdar, G.; Klein, E.; Pullum, G.K.; Sag, I.A. Generalised Phrase Structure Grammar; Blackwell: Oxford, UK, 1985.
27. Bresnan, J. Lexical Functional Syntax, 1st ed.; Wiley Blackwell: Oxford, UK, 2001.
28. XTAG Group. A Lexicalized TAG for English; Technical Report; University of Pennsylvania: Pennsylvania, PA, USA, 2001; Available online: https://repository.upenn.edu/cgi/viewcontent.cgi?article=1020&context=ircs_reports (accessed on 13 June 2020).
29. Joshi, A.K.; Schabes, Y. Tree-Adjoining Grammars and Lexicalized Grammars. Tech. Rep. (CIS). 1991, p. 445. Available online: http://repository.upenn.edu/cis_reports/445 (accessed on 22 April 2019).
30. Hegarty, M. Deriving Clausal Structure in Tree Adjoining Grammar; University of Pennsylvania: Pennsylvania, PA, USA, 1993.
31. Rambow, O. Mobile Heads and Strict Lexicalization. Master’s Thesis, University of Pennsylvania, Pennsylvania, PA, USA, 1993.
32. Krivochen, D.G.; García Fernández, L. On the position of subjects in Spanish periphrases: Subjecthood left and right. Boreal. Int. J. Hisp. Linguist. 2019, 8, 1–33. [CrossRef]
33. Krivochen, D.G.; García Fernández, L. Variability in syntactic-semantic cycles: Evidence from auxiliary chains. In Interface-Driven Phenomena in Spanish: Essays in Honor of Javier Gutiérrez-Rexach; González-Rivera, M., Sessarego, S., Eds.; Routledge: London, UK, 2020; pp. 145–168.
34. Bravo, A.; García Fernández, L.; Krivochen, D.G. On Auxiliary Chains: Auxiliaries at the Syntax-Semantics Interface. Borealis 2015, 4, 71–101. [CrossRef]
35. Huddleston, R.; Pullum, G. The Cambridge Grammar of the English Language; CUP: Cambridge, UK, 2002.
36. May, R. Logical Form: Its Structure and Derivation; MIT Press: Cambridge, MA, USA, 1985.
37. Fabb, N. Three squibs on auxiliaries. In MIT Working Papers in Linguistics, Volume 5. Papers in Grammatical Theory; Haïk, I.Y., Massam, D., Eds.; Department of Linguistics and Philosophy: Cambridge, MA, USA, 1983; pp. 104–120.
38. Epstein, S.D. Un-Principled Syntax and the Derivation of Syntactic Relations. In Working Minimalism; Epstein, S., Hornstein, N., Eds.; MIT Press: Cambridge, MA, USA, 1999; pp. 317–345.
39. Chomsky, N. The Minimalist Program; MIT Press: Cambridge, MA, USA, 1995.
40. Chomsky, N. The UCLA Lectures. University of Arizona, Tucson, AZ, USA. 2020. Available online: https://ling.auf.net/lingbuzz/005485 (accessed on 11 July 2020).
41. Evers, A. The Transformational Cycle in Dutch and German; Indiana University Linguistics Club: Indiana, IN, USA, 1975.
42. Den Besten, H.; Edmonson, J. The verbal complex in continental West Germanic. In On the Formal Syntax of the Westgermania; Abraham, W., Ed.; John Benjamins: Amsterdam, The Netherlands, 1983; pp. 155–216.

43. Haegeman, L.; van Riemsdijk, H. Verb Projection Raising, Scope, and the Typology of Rules Affecting Verbs. Linguist. Inq. 1986, 17, 417–466.
44. Wurmbrand, S. Verb clusters, verb raising, and restructuring. In The Blackwell Companion to Syntax; Everaert, M., van Riemsdijk, H., Eds.; Blackwell: Oxford, UK, 2017.
45. Kroch, A.; Santorini, B. The derived constituent structure of the West-Germanic verb-raising construction. In Principles and Parameters in Comparative Grammar; Freidin, R., Ed.; MIT Press: Cambridge, MA, USA, 1991; pp. 269–338.
46. Altmann, G. Prolegomena to Menzerath’s law. Glottometrika 1980, 2, 1–10.
47. Adger, D.; Svenonius, P. Features in minimalist syntax. In The Oxford Handbook of Linguistic Minimalism; Boeckx, C., Ed.; OUP: Oxford, UK, 2011; pp. 27–51.
48. Panagiotidis, P. Towards a (Minimalist) Theory of Features. University of Cyprus, Nicosia, Cyprus. 2021. Available online: https://ling.auf.net/lingbuzz/005615 (accessed on 15 February 2021).
49. Krivochen, D.G. Copies and Tokens: Displacement Revisited. Studia Linguist. 2015, 70, 250–296. [CrossRef]
50. Gärtner, H.-M. Generalized Transformations and Beyond. Reflections on Minimalist Syntax; Akademie Verlag: Berlin, Germany, 2002.
51. Collins, C.; Groat, E. Distinguishing Copies and Repetitions. 2018. Available online: http://ling.auf.net/lingbuzz/003809 (accessed on 7 May 2021).
52. Gärtner, H.-M. Copies from ‘Standard Set Theory’. 2020. Available online: https://ling.auf.net/lingbuzz/005942 (accessed on 10 September 2020).
53. Sampson, G. The Single Mother Condition. J. Linguist. 1975, 11, 1–11. [CrossRef]
54. Citko, B. On the Nature of Merge: External Merge, Internal Merge, and Parallel Merge. Linguist. Inq. 2005, 36, 475–496. [CrossRef]
55. Levine, R.D. Right node (non-)raising. Linguist. Inq. 1985, 16, 492–497.
56. McCawley, J.D. Parentheticals and Discontinuous Constituent Structure. Linguist. Inq. 1982, 13, 91–106.
57. Johnson, K. Toward a Multidominant Theory of Movement. Lectures Presented at ACTL, University College. June 2016. Available online: https://people.umass.edu/kbj/homepage/Content/Multi_Movement.pdf (accessed on 16 August 2020).
58. Kroch, A.; Joshi, A.K. The Linguistic Relevance of Tree Adjoining Grammar; Technical report MS-CS-85-16; Department of Computer and Information Sciences, University of Pennsylvania: Philadelphia, PA, USA, 1985.
59. Peters, S.; Ritchie, R. Phrase-Linking Grammar; University of Texas at Austin: Austin, TX, USA, 1981.
60. Postal, P. On some rules that are not successive-cyclic. Linguist. Inq. 1972, 3, 211–222.
61. Boeckx, C.; Grohmann, K.K. Barriers and Phases: Forward to the Past? Presented in Tools in Linguistic Theory 2004 (TiLT) Budapest (16–18 May 2004). Available online: http://people.umass.edu/roeper/711-05/Boexce%20barriers+phases%20tilt_bg_ho.pdf (accessed on 26 April 2018).
62. Sarkar, A.; Joshi, A.K. Handling Coordination in a Tree Adjoining Grammar; Technical Report; University of Pennsylvania: Philadelphia, PA, USA, 1997.
63. Sutherland, W. Introduction to Metric and Topological Spaces, 2nd ed.; OUP: Oxford, UK, 2009.
64. Searcóid, M. Metric Spaces; Springer: Berlin/Heidelberg, Germany, 2007.
65. Krivochen, D.G. Towards a Theory of Syntactic Workspaces: Neighbourhoods and Distances in a Lexicalised Grammar. 2021. Available online: https://ling.auf.net/lingbuzz/005689 (accessed on 6 May 2021).
66. Krivochen, D.G. Syntax on the Edge: A Graph-Theoretic Analysis of English (and Spanish) Sentence Structure. Under Review. 2018. Available online: https://ling.auf.net/lingbuzz/003842 (accessed on 6 May 2021).
67. Gorn, S. Handling the growth by definition of mechanical languages. In Proceedings of the Spring Joint Computer Conference, Atlantic City, NJ, USA, 18–20 April 1967; pp. 213–224. [CrossRef]
68. MacFarlane, J. Logical constants. In The Stanford Encyclopedia of Philosophy. 2017. Available online: http://plato.stanford.edu/archives/fall2015/entries/logical-constants/ (accessed on 18 March 2021).
69. Schmerling, S.F. Sound and Grammar: Towards a Neo-Sapirian ; Brill: Leiden, The Netherlands, 2018.
70. García Fernández, L.; Krivochen, D.G.; Martín Gómez, F. Los elementos intermedios en las perífrasis verbales. Lingüística Española Actual 2021.
71. Karttunen, L. Syntax and semantics of questions. Linguist. Philos. 1977, 1, 3–44. [CrossRef]
72. Hamblin, C.L. Questions in Montague English. Found. Lang. 1973, 10, 41–53.
73. Montague, R. The proper treatment of quantification in ordinary English. In Approaches to Natural Language; Hintikka, J., Moravcsik, J., Suppes, P., Eds.; Reidel: Dordrecht, The Netherlands, 1973; pp. 221–242.
74. Reinhart, T. Wh-in situ in the framework of the minimalist program. Nat. Lang. Semant. 1998, 6, 29–56. [CrossRef]
75. Fiengo, R. On Trace Theory. Linguist. Inq. 1977, 8, 35–61.
76. Chomsky, N. Some Puzzling Foundational Issues: The Reading Program. Catalan J. Linguist. 2019, 263–285. [CrossRef]
77. Torrego, E. On Inversion in Spanish and Some of Its Effects. Linguist. Inq. 1984, 15, 103–129.
78. Uriagereka, J. Multiple Spell-Out. In Derivations: Exploring the Dynamics of Syntax; Routledge: London, UK, 2002; pp. 45–65.
79. Nunes, J. Linearization of Chains and Sidewards Movement; MIT Press: Cambridge, MA, USA, 2004.
80. García Fernández, L.; Krivochen, D.G. Dependencias no locales y cadenas de verbos auxiliares. Verba 2019, 46, 207–244. [CrossRef]