Catena Operations for Unified Dependency Analysis

Total Page:16

File Type:pdf, Size:1020Kb

Catena Operations for Unified Dependency Analysis Catena Operations for Unified Dependency Analysis Kiril Simov and Petya Osenova Linguistic Modeling Department Institute of Information and Communication Technologies Bulgarian Academy of Sciences Bulgaria {kivs,petya}@bultreebank.org Abstract to its syntactic analysis. On the other hand, a uni- fied strategy of dependency analysis is proposed The notion of catena was introduced origi- via extending the catena to handle also phenomena nally to represent the syntactic structure of as valency and other combinatorial dependencies. multiword expressions with idiosyncratic Thus, a two-fold analysis is achieved: handling the semantics and non-constituent structure. lexicon-grammar relation and arriving at a single Later on, several other phenomena (such means for analyzing related phenomena. as ellipsis, verbal complexes, etc.) were The paper is structured as follows: the next formalized as catenae. This naturally led section outlines some previous work on catenae; to the suggestion that a catena can be con- section 3 focuses on the formal definition of the sidered a basic unit of syntax. In this paper catena and of catena-based lexical entries; sec- we present a formalization of catenae and tion 4 presents different lexical entries that demon- the main operations over them for mod- strate the expressive power of the catena formal- elling the combinatorial potential of units ism; section 5 concludes the paper. in dependency grammar. 2 Previous Work on Catenae 1 Introduction The notion of catena (chain) was introduced in Catenae were introduced initially to handle lin- (O’Grady, 1998) as a mechanism for representing guistic expressions with non-constituent structure the syntactic structure of idioms. He shows that and idiosyncratic semantics. It was shown in a for this task there is need for a definition of syntac- number of publications that this unit is appropri- tic patterns that do not coincide with constituents. ate for both - the analysis of syntactic (for exam- He defines the catena in the following way: The ple, ellipsis, idioms) and morphological phenom- words A, B, and C (order irrelevant) form a chain ena (for example, compounds). One of the impor- if and only if A immediately dominates B and C, tant questions in NLP is how to establish a connec- or if and only if A immediately dominates B and B tion between the lexicon and the text dimension in immediately dominates C. an operable way. At the moment most investiga- In recent years the notion of catena revived tions focus on the representation and analysis of again and was applied also to dependency rep- the text dimension. resentations. Catenae have been used success- We first employed catenae when modeling mul- fully for the modelling of problematic language tiword expressions in Bulgarian within the relation phenomena. (Gross 2010) presents the problems lexicon - text. (Simov and Osenova, 2014). En- in syntax and morphology that have led to the couraged by the promising results, we continued introduction of the subconstituent catena level. our research on how to exploit catenae as a uni- Constituency-based analysis faces non-constituent fied strategy for dependency analysis. In the paper structures in ellipsis, idioms, verb complexes. we use examples mostly from Bulgarian and to a Apart from the linguistic modelling of language lesser extend from English, but our approach is ap- phenomena, catenae have been used in a number plicable to other languages, as well. of NLP applications. (Maxwell et al., 2013), for In this piece of research we pursue both issues example, presents an approach to Information Re- mentioned above. On the one hand, we show in a trieval based on catenae. The authors consider formal way how the lexicon representation maps the catena as a mechanism for semantic encoding 320 Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015), pages 320–329, Uppsala, Sweden, August 24–26 2015. root iobj con. conj In order to model the variety of phenomena subj det and characteristics encoded in a dependency gram- cc mar we extend the catena with partial arc and John bought and ate an apple node labels. We follow the approach taken in rootC rootC rootC CoNLL shared tasks on dependency parsing repre- conj senting for each node its word form, lemma, part det cc of speech, extended part of speech, grammatical John bought and ate an apple features (and later – semantics). This provides a flexible mechanism for expressing the combinato- Figure 1: A complete dependency tree and some rial potential of lexical items. In the following def- of its catenae. inition all grammatical features are represented as POS tags. Let us have the sets: LA — a set of POS tags1, which overcomes the problems of long-distance LE — a set of lemmas, WF — a set of word paths and elliptical sentences. The employment of forms, and a set of dependency tags D (ROOT catenae in NLP applications is additional motiva- ∈ D). Let us have a sentence x = w , ..., w .A tion for us to use it in the modelling of the interface 1 n is a directed tree T = between the treebank and the lexicon. tagged dependency tree (V, A, π, λ, ω, δ) where: Terminology note: an alternative term for catena is treelet. It has been used in the area of ma- 1. V = 0, 1, ..., n is an ordered set of nodes chine translation as a unit for translation transfer { } that corresponds to an enumeration of the (see (Quirk et al., 2005)). Their definition is equiv- words in the sentence (the root of the tree has alent to the definition of catena. Also (Kuhlmann, index 0); 2010) uses treelet for a node and its children (if any). In the paper we resort to the term catena 2. A V V is a set of arcs. For each node ⊆ × because it is closer to the spirit of the issues dis- i, 1 i n, there is exactly one arc in A: ≤ ≤ cussed here. i, j A, 0 j n, i = j. There is h i ∈ ≤ ≤ 6 exactly one arc i, 0 A; 3 Formal Definition of Catena h i ∈ In this section we define the formal presentation 3. π : V 0 LA is a total labelling func- − { } → 2 of the catena as it is used in syntax and in the tion from nodes to POS tags . π is not de- lexicon. Here we follow the definition of catena fined for the root; provided by (O’Grady, 1998) and (Gross, 2010): 4. λ : V 0 LE is a total labelling func- a catena is a word or a combination of words − { } → tion from nodes to lemmas. λ is not defined directly connected in the dominance dimension. for the root; In reality this definition of catena for dependency trees is equivalent to a subtree definition. Fig. 1 5. ω : V 0 WF is a total labelling func- depicts a complete dependency tree and some of − { } → tion from nodes to word forms. ω is not de- its catenae. Notice that the complete tree is also fined for the root; a catena itself. With “rootC ” we mark the root of the catena. It might be the same as the root of 6. δ : A D is a total labelling function for → the complete tree, but also might be different as arcs. Only the arc i, 0 is mapped to the label h i in the cases of “John” and “an apple”. Following ROOT ; (Osborne et al., 2012) we prefer to use the notion of catena to that of dependency subtree or treelet 7. 0 is the root of the tree. as mentioned above. We aim to utilize the notion 1In the formal definitions here we use tags as entities, but of catena for several purposes: representation of in practice they are sets of grammatical features words and multiword expressions in the lexicon, 2In case when we are interested in part of the grammatical their realization in the actual trees expressing the features encoded in a POS tag we could consider p as a set of different mappings for the different grammatical features. It analysis of sentences as well as for representation is easy to extend the definition in this respect, but we do not of derivational structure of compounds in the lexi- do this here. 321 We will hereafter refer to this structure as a arc labels. For example, the catena for “to spill the parse tree for the sentence x. The node 0 does beans” will allow for any realization of the verb not correspond to a word form in the sentence, but form like in: “they spilled the beans” and “he spills plays the role of a root of the tree. the beans”. Thus, the catena in the lexicon will Let T = (V, A, π, λ, ω, δ) be a tagged depen- be underspecified with respect to the grammatical dency tree. features and word form for the verb. Let T = (V, A, π, λ, ω, δ) be a tagged root dependency tree. A directed tree G = dobj (VG,AG, πG, λG, ωG, δG) is called dependency clitic catena of T if and only if there exists a mapping ψ : V V 3 such that: Vpi Pp Nc G → – си очите затварям си око 1. AG A, the set of arcs of G; ⊆ shut one’s eyes 2. π π is a partial labelling function from G ⊆ nodes of G to POS tags; Realization 1: 3. λ λ is a partial labelling function from G ⊆ rootC nodes to lemmas; dobj iobj pobj 4. ω ω is a partial labelling function from clitic G ⊆ nodes to word forms; Nc Pp Vpi R Nc Очите си затваряха пред фактите 5. δG δ is a partial labelling function for arcs.
Recommended publications
  • Interpreting and Defining Connections in Dependency Structures Sylvain Kahane
    Interpreting and defining connections in dependency structures Sylvain Kahane To cite this version: Sylvain Kahane. Interpreting and defining connections in dependency structures. 5th international conference on Dependency Linguistics (Depling), Aug 2019, Paris, France. hal-02291673 HAL Id: hal-02291673 https://hal.parisnanterre.fr//hal-02291673 Submitted on 19 Sep 2019 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Interpreting and defining connections in dependency structures Sylvain Kahane Modyco, Université Paris Nanterre & CNRS [email protected] Abstract This paper highlights the advantages of not interpreting connections in a dependency tree as combina- tions between words but of interpreting them more broadly as sets of combinations between catenae. One of the most important outcomes is the possibility of associating a connection structure to any set of combinations assuming some well-formedness properties and of providing a new way to define de- pendency trees and other kinds of dependency structures, which are not trees but “bubble graphs”. The status of catenae of dependency trees as syntactic units is discussed. 1 Introduction The objective of this article is twofold: first, to show that dependencies in a dependency tree or a similar structure should not generally be interpreted only as combinations between words; second, to show that a broader interpreta- tion of connections has various advantages: It makes it easier to define the dependency structure and to better under- stand the link between different representations of the syntactic structure.
    [Show full text]
  • Some Observations on the Hebrew Desiderative Construction – a Dependency-Based Account in Terms of Catenae1
    Thomas Groß Some Observations on the Hebrew Desiderative Construction – A Dependency-Based Account in Terms of Catenae1 Abstract The Modern Hebrew (MH) desiderative construction must obey four conditions: 1. A subordinate clause headed by the clitic še= ‘that’ must be present. 2. The verb in the subordinate clause must be marked with future tense. 3. The grammatical properties genus, number, and person tend to be specified, i.e. if the future tense affix is underspecified, material tends to appear that aids specification, if contextual recovery is unavailable. 4. The units of form that make up the constructional meaning of the desiderative must qualify as a catena. A catena is a dependency-based unit of form, the parts of which are immediately continuous in the vertical dimension. The description of the individual parts of the desiderative must address trans-, pre-, and suffixes, and cliticization. Catena-based morphology is representational, monostratal, dependency-, construction-, and piece-based. 1. Purpose, means and claims The main purpose of this paper is to analyze the Hebrew desiderative construction. This construction is linguistically interesting and challenging for a number of reasons. 1. It is a periphrastic construction, with fairly transparent compositionality. 2. It is transclausal, i.e. some parts of the construction reside in the main clause, and others in the subordinated clause. The complementizer is also part of the construction. 3. The construction consists of more than one word, but it does not qualify as a constituent. Rather the construction cuts into words. 4. Two theoretically 1 I want to thank Outi Bat-El (Tel Aviv University) and three anonymous reviewers for their help and advice.
    [Show full text]
  • Clitics in Dependency Morphology
    Clitics in Dependency Morphology Thomas Groß Aichi University, Nagoya, Japan [email protected] tured in much the same fashion as sentences was Abstract proposed first in Williams (1981), and further discussed in the famous “head-debate” between Clitics are challenging for many theories of grammar Zwicky (1985a) and Hudson (1987). In contem- because they straddle syntax and morphology. In most porary morphological theories that attempt to theories, cliticization is considered a phrasal pheno- inform syntax (predominantly within the genera- menon: clitics are affix-like expressions that attach to tive framework) such as Di Sciullo (2005) and whole phrases. Constituency-based grammars in par- the theory of Distributed Morphology (Halle and ticular struggle with the exact constituent structure of such expressions. This paper proposes a solution Marantz 1993, Embick and Noyer 2001, 2007, based on catena-based dependency morphology. This Harley and Noyer 2003, Embick 2003), words theory is an extension of catena-based dependency are now seen as hierarchically structured items. syntax. Following Authors et.al. (in press), a word or Seen in the light of this development, it is time a combination of words in syntax that are continuous for dependency grammar (DG) to make up for its with respect to dominance form a catena. Likewise, a neglect of morphological matters. The assess- morph or a combination of morphs that is continuous ment by Harnisch (2003) that the development of with respect to dominance form a morph catena. Em- a dependency-based morphology requires imme- ploying the concept of morph catena together with a diate attention is accurate.
    [Show full text]
  • Interpreting and Defining Connections in Dependency Structures
    Interpreting and defining connections in dependency structures Sylvain Kahane Modyco, Université Paris Nanterre & CNRS [email protected] Abstract This paper highlights the advantages of not interpreting connections in a dependency tree as combina- tions between words but of interpreting them more broadly as sets of combinations between catenae. One of the most important outcomes is the possibility of associating a connection structure to any set of combinations assuming some well-formedness properties and of providing a new way to define de- pendency trees and other kinds of dependency structures, which are not trees but “bubble graphs”. The status of catenae of dependency trees as syntactic units is discussed. 1 Introduction The objective of this article is twofold: first, to show that dependencies in a dependency tree or a similar structure should not generally be interpreted only as combinations between words; second, to show that a broader interpreta- tion of connections has various advantages: It makes it easier to define the dependency structure and to better under- stand the link between different representations of the syntactic structure. We will focus on the possible syntactic combinations (what combines with what), without considering the fact that combinations are generally asymmetric, linking a governor to a dependent. We use the term connection to denote a dependency without the governor-depen- dency hierarchy. Section 2 presents the notions of combination and connection, which are based on the notion of catena (Osborne et al. 2012). The properties of the set of catenae are studied and used to define an equivalence relation on the set of combinations, which is used to define the connections.
    [Show full text]
  • Defining Dependencies (And Constituents)
    Gerdes K., Kahane S. (2013) Defining dependency (and constituency), in K. Gerdes, E. Hajičová, L. Wanner (éds.), Computational Dependency Linguistics, IOS Press. Defining dependencies (and constituents) a b Kim GERDES and Sylvain KAHANE a LPP, Sorbonne Nouvelle, Paris b Modyco, Université Paris Ouest Nanterre Abstract. The paper proposes a mathematical method of defining dependency and constituency provided linguistic criteria to characterize the acceptable fragments of an utterance have been put forward. The method can be used to define syntactic structures of sentences, as well as discourse structures for texts, or even morphematic structures for words. Our methodology leads us to put forward the notion of connection, simpler than dependency, and to propose various representations which are not necessarily trees. We also study criteria to refine and hierarchize a connection structure in order to obtain a tree. Keywords. Connection graph, dependency tree, phrase structure, syntactic representation, syntactic unit, catena Introduction Syntacticians generally agree on the hierarchical structure of syntactic representations. Two types of structures are commonly considered: Constituent structures and dependency structures (or mixed forms of both, like headed constituent structures, sometimes even with functional labeling, see for example Tesnière’s nucleus [34], Kahane’s bubble trees [21], or the Negra corpus’ trees [6]). However, these structures are often rather purely intuition- based than well-defined and linguistically motivated, a point that we will illustrate with some examples. Even the basic assumptions concerning the underlying mathematical structure of the considered objects (ordered constituent tree, unordered dependency tree) are rarely motivated (why should syntactic structures be trees to begin with?). In this paper, we propose a definition of syntactic structures that supersedes constituency and dependency, based on a minimal axiom: If an utterance can be separated into two fragments, we suppose the existence of a connection between these two parts.
    [Show full text]
  • Catena Operations for Unified Dependency Analysis
    Catena Operations for Unified Dependency Analysis Kiril Simov and Petya Osenova Linguistic Modeling Department Institute of Information and Communication Technologies Bulgarian Academy of Sciences Bulgaria {kivs,petya}@bultreebank.org Abstract to its syntactic analysis. On the other hand, a uni- fied strategy of dependency analysis is proposed The notion of catena was introduced origi- via extending the catena to handle also phenomena nally to represent the syntactic structure of as valency and other combinatorial dependencies. multiword expressions with idiosyncratic Thus, a two-fold analysis is achieved: handling the semantics and non-constituent structure. lexicon-grammar relation and arriving at a single Later on, several other phenomena (such means for analyzing related phenomena. as ellipsis, verbal complexes, etc.) were The paper is structured as follows: the next formalized as catenae. This naturally led section outlines some previous work on catenae; to the suggestion that a catena can be con- section 3 focuses on the formal definition of the sidered a basic unit of syntax. In this paper catena and of catena-based lexical entries; sec- we present a formalization of catenae and tion 4 presents different lexical entries that demon- the main operations over them for mod- strate the expressive power of the catena formal- elling the combinatorial potential of units ism; section 5 concludes the paper. in dependency grammar. 2 Previous Work on Catenae 1 Introduction The notion of catena (chain) was introduced in Catenae were introduced initially to handle lin- (O’Grady, 1998) as a mechanism for representing guistic expressions with non-constituent structure the syntactic structure of idioms.
    [Show full text]
  • Transformational Grammarians and Other Paradoxes Abstract Keywords
    Barcelona, 8-9 September 2011 Transformational Grammarians and other Paradoxes Thomas Groß Aichi University 1 Machihata-cho 1-1, Toyohashi-shi, Aichi-ken, Japan [email protected] Abstract This paper argues that morphs can/should be granted node status in tree structures. Most theories of morphology do not do this. For instance, word-based morphologies (Anderson 1992 and others) see inflectional affixes appearing post-syntactically, producing a specific word form based on paradigmatic rules. On the other hand, derivational affixes attach prior to syntax. So-called “bracketing paradoxes” (Williams 1981, Pesetsky 1985, Spencer 1988, Sproat 1988, Beard 1991, Stump 1991) such as transformational grammarian concern primarily derivational affixes (here: -ian). If a theory can avoid bracketing (or structural) paradoxes by maintaining an entirely surface-based account, then this theory should be preferred over competing theories that posit different levels or strata in order to explain the same phenomenon. This contribution demonstrates that such a surface-based account is possible if the catena is acknowledged. The catena is a novel unit of syntax and morphosyntax that exists solely in the vertical dominance dimension. Keywords Bracketing paradox, catena, morph catena 1 Introduction (Williams 1981:219f) is credited with introducing “bracketing paradoxes” to theoretical linguistics. He puts forth examples such as the following: (1) a. hydroelectricity (2) a. Gödel numbering (3) a. atomic scientist For example (1a) Williams posits the next structure: (1) b. [hydro-[electric-ity]] The problem with (1b), Williams realizes, is that the structure cannot provide the adjective hydroelectric because the prefix and the root do not appear within a bracket that excludes the suffix.
    [Show full text]
  • A Lexical Theory of Phrasal Idioms
    A Lexical Theory of Phrasal Idioms Paul Kay Ivan A. Sag U.C. Berkeley & Stanford University Stanford University Dan Flickinger Stanford University Abstract This paper presents a theory of syntactically flexible phrasal idioms that explains the properties of such phrases, e.g. keep tabs on, spill the beans, in terms of general combinatoric restrictions on the individual idiomatic words (more precisely, the lexemes) that they contain, e.g. keep, tabs, on. Our lexi- cal approach, taken together with a construction-based grammar, provides a natural account of the different classes of English idioms and of the idiosyn- crasies distinguishing among particular phrasal idioms. 1 Introduction Among the most important desiderata for a theory of idioms, also known as multi- word expressions (MWEs), are the following four. First, idioms divide into classes whose distinct properties, described below, need to be theoretically accommo- dated. Second, the theory should get the semantics right. It should, for example, represent the fact that (1a) and (1b) have roughly the same meaning, which can be glossed as (1c):1 0We would like to thank Frank Richter, Manfred Sailer, and Thomas Wasow for discussion of some of the issues raised here, Stephen Wechsler for comments on an earlier draft, and especially Stefan Muller¨ for detailed comments and extensive discussion. We are also indebted for highly pertinent comments to two anonymous referees for the Journal of Linguistics. None are to be blamed for the good advice we have failed to accept. 1All unstarred examples in this paper were attested on the web at the time of drafting, unless otherwise noted.
    [Show full text]
  • Catenae in Morphology
    Catenae in Morphology Thomas Groß Aichi University, Nagoya, Japan [email protected] takes the backseat with just 8 pages. Valency Abstract theory is characterized by putting valency- bearing lexical items at center stage. Assuming This paper argues for a renewed attempt at morpholo- that non-lexical material is somehow subsumed gy in dependency grammar. The proposal made here by lexical material seems on a logical trajectory. is based on the concept of the “catena” proposed by But research in typology, foremost Bybee (1985), Authors (in press). The predecessor to this notion was has confirmed that affixes as expressions of va- the “chain” introduced by O‟Grady (1998), and em- lency, voice, aspect, modality, tense, mood, and ployed by Osborne (2005) and Groß and Osborne (2009). In morphology and morphosyntax, a morph person obtain in a specific linear order (or hie- catena is A MORPH OR A COMBINATION OF MORPHS rarchy), and developments in generative gram- THAT IS CONTINUOUS WITH RESPECT TO DOMINANCE. mar during the 1980‟s emphasized the domin- This concept allows for a parsimonious treatment of ance structure of the IP/TP, where such affixes morphology on the surface. The fact that no additional are thought to be located. Similar statements also terms and concepts are necessary for the analysis of concern NP structure: if case or plural is ex- morphological data is highly desirable because it pressed by morphs, then these morphs appear in makes a fluid transition from syntax to morphology peripheral position, an indication that they domi- possible. This paper introduces the relevant depen- nate their nouns.
    [Show full text]
  • Why So Many Nodes? Dan Maxwell US Chess Center Washington DC, USA [email protected]
    Why so many nodes? Dan Maxwell US Chess Center Washington DC, USA [email protected] Abstract relationships include subject and verb, (more This paper provides an analysis of the rarely) object and verb, noun and adjective, representation of various grammatical phenomena and noun and determiner. Highly inflected in both constituency structure and dependency languages like Russian have all of these kinds structure (hereafter c-structure and d-structure), of agreement. English has only two of them including agreement, case marking, and word order (and then only to a limited extent). So-called in transitive sentences, as well as three theoretical isolating languages like Vietnamese do not constructs, and the interface between the form of a sentence and its meaning. There is a crucial show agreement at all. The agreement can be structural relationship for all of these phenomena in in person, number, gender, and (for case- both constituency grammar and dependency inflecting languages only) case. grammar, but only the version used in dependency English has agreement between the subject grammar is fully reliable. This insight evokes the and the verb in the third person singular of the following question: Why do linguists working with present tense, as shown by the contrast constituency grammars think that so many nodes between (1a) on the one hand and (1b) and (1c) are necessary? Upon examination, the extra nodes on the other: succeed only in confusing our understanding of syntactic phenomena. (1) a. John walks slowly. 1 Introduction b. John and Sarah walk quickly. c. I walk more quickly than they do.
    [Show full text]
  • Tests for Constituents: What They Really Reveal About the Nature of Syntactic Structure
    http://www.ludjournal.org ISSN: 2329-583x Tests for constituents: What they really reveal about the nature of syntactic structure Timothy Osbornea a Department of Linguistics, Zhejiang University, [email protected]. Paper received: 17 June 2017 DOI: 10.31885/lud.5.1.223 Published online: 10 April 2018 Abstract. Syntax is a central subfield within linguistics and is important for the study of natural languages, since they all have syntax. Theories of syntax can vary drastically, though. They tend to be based on one of two competing principles, on dependency or phrase structure. Surprisingly, the tests for constituents that are widely employed in syntax and linguistics research to demonstrate the manner in which words are grouped together forming higher units of syntactic structure (phrases and clauses) actually support dependency over phrase structure. The tests identify much less sentence structure than phrase structure syntax assumes. The reason this situation is surprising is that phrase structure has been dominant in research on syntax over the past 60 years. This article examines the issue in depth. Dozens of texts were surveyed to determine how tests for constituents are employed and understood. Most of the tests identify phrasal constituents only; they deliver little support for the existence of subphrasal strings as constituents. This situation is consistent with dependency structure, since for dependency, subphrasal strings are not constituents to begin with. Keywords: phrase structure, phrase structure grammar, constituency tests, constituent, dependency grammar, tests for constituents 1. Dependency, phrase structure, and tests for constituents Syntax, a major subfield within linguistics, is of course central to all theories of language.
    [Show full text]
  • CATENAE Cambridge
    CATENAE Cambridge Catena: A WORD OR A contains “Catena” is the COMBINATION OF WORDS sentence catenae Latin word for THAT IS CONTINUOUS WITH This 15 „chain‟. RESPECT TO DOMINANCE This sentence contains 15 catenae. According to the definition, any tree or subtree of a tree is a catena. Individual words (5): this, sentence, contains, 15, catenae; Word combinations (9): This sentence, This sentence contains, This sentence con- tains…catenae, sentence contains, sentence contains…catenae, sentence contains 15 catenae, contains…catenae, contains 15 catenae and 15 catenae; The entire sentence (1). 5+9+1=15 Total number of word combinations = 25(= number of words) -1 = 31 Number of non-catena word combinations = 31 – 15 = 16 Non-catenae word combinations (16): This…contains, This…15, This…catenae, This…contains 15, This…contains…catenae, This…contains 15 catenae, This…15 catenae, sentence…15, sentence…catenae, sentence contains 15, sentence…15 catenae, contains 15, This sentence…15, This sentence…catenae, This sentence…15 catenae, This sentence con- tains 15 The catena is central to the analysis of IDIOMS, ELLIPSIS, PREDICATES, CONSTRUCTIONS, DIS- PLACEMENT and of many other phenomena. Idioms: threw Ellipsis: have She book at boys played -Gapping the him The soccer in She threw the book at him. yard the Predicates: heard The boys have played soccer in the yard, and He her the girls have played hockey in the street. He heard her. Constructions: works SV-construction of English will That really VERBfinite He have That really works SUBJECT heard klappt V2-construction of German her Vielleicht das VERBfinite He will have heard her.
    [Show full text]