3 Compounding

66 Nigel Fabb 3 Compounding NIGEL FABB 1 Overview A compound is a word which consists of two or more words. For example, the Malay compound mata-hari ‘sun’ is a word which consists of two words: mata ‘eye’ and hari ‘day’. Compounds are subject to phonological and morphological processes, which may be specific to compounds or may be shared with other structures, whether derived words or phrases; we explore some of these, and their implications, in this chapter. The words in a compound retain a meaning similar to their meaning as isolated words, but with certain restrictions; for example, a noun in a compound will have a generic rather than a referential function: as Downing (1977) puts it, not every man who takes out the garbage is a garbage man. 1.1 Structure and interpretation The meaning of a compound is usually to some extent compositional, though it is often not predictable. For example, popcorn is a kind of corn which pops; once you know the meaning, it is possible to see how the parts contribute to the whole – but if you do not know the meaning of the whole, you are not certain to guess it by looking at the meaning of the parts. This lack of predict- ability arises mainly from two characteristics of compounds: (a) compounds are subject to processes of semantic drift, which can include metonymy, so that a redhead is a person who has red hair; (b) there are many possible semantic relations between the parts in a compound, as between the parts in a sentence, but unlike a sentence, in a compound, case, prepositions and structural position are not available to clarify the semantic relation. 1.1.1 Endocentric and exocentric compounds Compounds which have a head are called ‘endocentric compounds’. A head of a compound has similar Compounding 67 characteristics to the head of a phrase: it represents the core meaning of the constituent, and it is of the same word class. For example, in sneak-thief, thief is the head (a sneak-thief is a kind of thief; thief and sneak-thief are both nouns). Compounds without a head are called ‘exocentric compounds’ or ‘bahuvrihi compounds’ (the Sanskrit name). The distinction between endocentric and exocentric compounds is sometimes a matter of interpretation, and is often of little relevance; for example, whether you think greenhouse is an endocentric or exocentric compound depends on whether you think it is a kind of house. The major interest in the head of a compound relates to the fact that where there is a clear head, its position seems to be constrained; endocentric compounds tend to have heads in a language systematically on either the right (e.g. English) or left (e.g. Vietnamese, French). 1.1.2 Co-ordinate compounds There is a third kind of compound, where there is some reason to think of both words as equally sharing head-like characteristics, as in student-prince (both a student and a prince); these are called ‘appositional’ or ‘co-ordinate’ or ‘dvandva’ (the Sanskrit name) compounds. Co-ordinate compounds can be a combination of synonyms (example from Haitian): toro-bèf(bull-cow) ‘male cow’ a combination of antonyms (example from French): aigre-doux (sour-sweet) or a combination of parallel things (example from Malayalam): acchanammamaaIf ( father-mother-pl.) ‘parents’ Co-ordinate compounds can have special characteristics in a language. For example, in Mandarin, co-ordinate compounds behave differently in terms of theta-assignment from (endocentric) resultative verb compounds. And in Malayalam, co-ordinate compounds are not affected by gemination processes which other compounds undergo (see section 5.2). 1.1.3 The semantic relations between the parts The semantic relations between the parts of a compound can often be understood in terms of modification; this is true even for some exocentric compounds like redhead. Modifier– modifiee relations are often found in compounds which resemble equivalent phrases; this is true, for example, of English AN%N1 compounds and many Mandarin compounds (see Anderson 1985b). It is not always the case, though; the French compound est-allemand (East German) corresponds to a phrase allemand de l’est (German from the East). In addition, many compounds mani- fest relations which can be interpreted as predicator–argument relations, as in 68 Nigel Fabb sunrise or pull-chain. Note that in pull-chain, chain can be interpreted as an argument of pull, and at the same time pull can be interpreted as a modifier of chain. 1.1.4 Transparency: interpretive and formal The transparency and pre- dictability of a compound are sometimes correlated with its structural transparency. For example, in languages with two distinct types of compound where one is more interpretively transparent than the other, the less interpretively transparent type will often be subject to greater phonological or morphological modification. A diachronic loss of transparency (both formal and interpretive) can be seen in the process whereby a part of a compound becomes an affix, as in the development of English -like as the second part of a compound to become the derivational suffix -ly. 1.2 Types of compound Accounts of compounds have divided them into classes. Some of these – such as the exocentric, endocentric and appositional types, or the various interpretive types (modifier–modifiee, complement–predicator, etc.) – are widespread across languages. Then there are compound types which are language- or language- family-specific, such as the Japanese postsyntactic compounds (4.3), Hebrew construct state nominals (4.2), or Mandarin resultative verb compounds (4.1). Other types of compounds are found intermittently; these include synthetic compounds (section 1.2.1), incorporation compounds (section 1.2.2) and reduplication compounds (section 1.2.3). 1.2.1 Synthetic (verbal) compounds The synthetic compound (also called ‘verbal compound’) is characterized by a co-occurrence of particular formal characteristics with particular restrictions on interpretation. Not all languages have synthetic compounds (e.g. English does, but French does not). The formal characteristic is that a synthetic compound has as its head a derived word con- sisting of a verb plus one of a set of affixes (many writers on English restrict this to agentive -er, nominal and adjectival -ing, and the passive adjectival -en). Thus the following are formally characterized as synthetic compounds: expert-test-ed checker-play-ing (as an adjective: a checker-playing king) window-clean-ing (as a noun) meat-eat-er (There is some disagreement about whether other affixes should also be included, so that slum-clear-ance for example would be a synthetic compound.) Compounds with this structure are subject to various restrictions (summarized by Roeper and Siegel 1978), most prominent of which is that the left-hand member must be interpreted as equivalent to a syntactic ‘first sister’ of the Compounding 69 right-hand member. We discuss this in section 3.2. There have been many accounts of synthetic compounds, including Roeper and Siegel 1978, Selkirk 1982, Lieber 1983, Fabb 1984, Botha 1981 on Afrikaans, and Brousseau 1988 on Fon; see also section 2.1.2. Many of the relevant arguments are summarized in Spencer 1991. 1.2.2 Incorporation compounds In some languages, incorporation words resemble compounds: for example, both a verb and an incorporated noun may exist as independent words. Bybee (1985) comments on this, but suggests that even where the two parts may be independently attested words, an incorporation word may differ from a compound in certain ways. This includes phonological or morphological differences between incorporated and free forms of a word. (In fact, as we have seen, this is true also of some compound processes; e.g. segment loss, which is found in Tiwi incorporation words, is found also in ASL compounds.) Another difference which Bybee suggests may dis- tinguish incorporation from compounding processes is that the incorporation of a word may depend on its semantic class. For example, in Pawnee it is mainly body part words which are incorporated, while various kinds of name are not (such as personal names, kinship terms, names of particular species of tree, etc.). It is possible that compounding is not restricted in this way; as we will see, semantic restrictions on compounding tend to be in terms of the relation between the parts rather than in terms of the individual meanings of the parts. 1.2.3 Repetition compounds Whole-word reduplication is sometimes described as a compounding process, because each part of the resulting word corresponds to an independently attested word. Steever (1988), for example, describes as compounds Tamil words which are generated through reduplication: vantu ‘coming’ is reduplicated as vantu-vantu ‘coming time and again’. In another type of Tamil reduplicated compound, the second word is slightly modified: viyAparam ‘business’ becomes viyAparam-kiyAparam ‘business and such’. English examples of this type of compound include words like higgledy-piggledy, hotchpotch, and so on. 1.3 Compounds which contain ‘bound words’ In a prototypical compound, both parts are independently attested as words. However, it is possible to find words which can be parsed into an independently attested word plus another morpheme which is not an independently attested word but also does not appear to be an affix (see Aronoff 1976). Here are some examples from English (unattested part italicized): church-goer ironmonger television cranberry The part which is not attested as an isolated word is sometimes found in other words as well; in some cases it may become an attested word: for example, telly 70 Nigel Fabb (= television). These parts fail to resemble affixes morphologically (they are relat- ively unproductive compared to most affixes), and there is no good evidence on phonological grounds for considering them to be affixes.

3 Compounding

This Chapter Provides a Historical Framework For

Compound Word Formation.Pdf

Hunspell – the Free Spelling Checker

Building a Treebank for French

Syntactic-Words.Pdf

Compositionality in English Deverbal Compounds

Using the Web As an Implicit Training Set: Application to Noun Compound

Compounds and Word Trees

The Challenges of Translating English Compounds Into Arabic - for Better Or for Worse

A Poor Man's Approach to CLEF

Verbal Compounding in English: a Challenge for Usage-Based Models of Word-Formation?

Robust Prediction of Punctuation and Truecasing for Medical ASR