
of the eternal problem of how to specify what an individual sense of a word ‘means.’ The study of […] is less ‘autonomous’ than that of, say, […], or […]. Especially if one takes a cognitive linguistic view, there is no clear dividing line between lexical semantics and the study of conceptual categories within cognitive psychology; and advances in one field tend to have repercussions in the other. Advances in certain areas of psycholinguistics can also be expected to throw light on word meaning. For instance, there is currently a developing body of work on the time course of semantic activation. The meaning of a word is not activated all at once when a word is recognized, and the details of the activation process cannot fail to have relevance to our understanding of the internal structure of a word’s meaning.

One area of practical concern, which is poised for a major take-off, but is currently held back by lexical semantic problems, is the automatic processing of natural language by computational systems. The main problems are the complexity of natural meanings and their contextual variability. The work currently being done in this area can be expected to spill over, not only into general lexicography, but also into the linguistic study of word meanings.

See also: Dementia, Semantic; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Lexicology and Lexicography; Lexicon; Semantic Knowledge: Neural Basis of; Semantics

Bibliography

Cruse D A 1986 Lexical Semantics. Cambridge University Press, Cambridge, UK
Cruse D A 2000 Meaning in Language: An Introduction to Semantics and Pragmatics. Oxford University Press, Oxford, UK
Levin B, Hovav Rappaport M 1992 Wiping the slate clean: A lexical semantic exploration. In: Levin B, Pinker S (eds.) Lexical and Conceptual Semantics. Blackwell, Oxford, UK, pp. 123–52
Lyons J 1963 […]. Cambridge University Press, Cambridge, UK
Lyons J 1968 Introduction to Theoretical Linguistics. Cambridge University Press, Cambridge, UK
Lyons J 1977 Semantics. Cambridge University Press, Cambridge, UK
Nida E A 1975 Componential Analysis of Meaning: An Introduction to Semantic Structures. Mouton, The Hague, The Netherlands
Taylor J R 1989 Linguistic Categorization: Prototypes in Linguistic Theory. Clarendon Press, Oxford, UK
Ungerer F, Schmid H-J 1996 An Introduction to Cognitive Linguistics. Longman, London
Wierzbicka A 1996 Semantics: Primes and Universals. Oxford University Press, Oxford, UK

D. A. Cruse

Lexicology and Lexicography

1. Introduction

Each language has a lexicon and a grammar, i.e., a set of elementary expressions and a set of rules according to which complex expressions are constructed from simpler ones. Some of these rules form complex words; others operate beyond the boundaries of the word, thus producing phrases and sentences. These distinctions, familiar from the days of the Greek grammarians, are not always clear cut, for at least two reasons. First, the notion of ‘word’ is not very well-defined (see also Word Classes and Parts of Speech). Second, there are complex expressions whose meaning is more or less predictable from the meaning of their components, whereas this is not true for other complex expressions. The former are said to be ‘compositional,’ whereas the latter are ‘lexicalized’; slightly different terms to characterize this opposition are ‘productive’ vs. ‘idiomatic,’ and ‘free’ vs. ‘fixed’; in each case, the distinction is gradual. Lexicalization is rarely observed for inflected words (possible exceptions are ‘participles’ such as crooked in a crooked street), but very frequent for […] words, such as landlord or (to) withdraw, or phrases such as to kick the bucket, which has a compositional as well as a lexicalized reading. Do lexicalized expressions belong to the lexicon of a language or to its grammar? There is no straightforward answer; their form is complex and rule-based, their meaning is not. Therefore, it is useful to take the term ‘lexicon’ in a somewhat broader sense; it contains all elementary expressions (lexicon in the narrower sense) as well as those expressions which are compound in form but not accordingly in meaning (see also Lexicon). The scientific investigation of the lexicon in this sense is usually called lexicology; it includes, for example, the historical development of the lexicon, its social stratification, its quantitative composition or the way in which some subfield is encoded in lexical items (e.g., ‘[…] of hunting,’ ‘[…] of movement’). Lexicography, by contrast, deals with the compilation of dictionaries. There is considerable overlap between both disciplines, and in fact, not all authors make such a terminological distinction.

2. The Lexicon

The lexicon of a language is stored primarily in the head of its speakers, and for most of the history of mankind, it was only stored there. We do not know what form the ‘mental lexicon’ has (see also Psycholinguistics: Overview). There is agreement, however, that it consists of individual lexical units which are somehow interrelated to each other. There is no generally accepted term for lexical units. The familiar

term ‘word’ is both too broad and too narrow; one would not want to consider goes as a lexical unit, although it is a word, whereas expressions such as (to) cut up or red herring are lexical units but consist of several words. Other terms occasionally found are ‘[…],’ ‘[…],’ or ‘lexical entry,’ but since these are also used in other ways, it is probably best to speak of lexical units.

It is important to distinguish between a lexical unit and the way in which it is named. The word house in a dictionary, followed by all sorts of explanations, is not the lexical unit; it is a […] for such a unit. The lexical unit itself is a bundle of various types of properties. These include:
(a) phonological properties, which characterize how the lexical unit is pronounced; they include sounds, syllabic structure, lexical accent and, in some languages, lexical tone;
(b) graphematic properties, which characterize how the lexical unit is written (see also […]);
(c) morphosyntactic properties, which characterize how the unit can become part of more complex expressions; typically, they concern inflectional paradigm, word class, government relations, and others;
(d) semantic properties, which concern the ‘lexical meaning’ of the unit, i.e., the contribution which it makes to the meaning of the construction in which it occurs.

Some of these properties may be absent. This is most obvious for graphematic properties, since not all languages are written. There are a few lexical units without lexical meaning, such as the expletive there in English. Many linguists also stipulate ‘zero elements,’ i.e., units with morphosyntactic and semantic properties but without phonological properties (such as ‘empty […]’); but these are normally treated in the grammar rather than in the lexicon.

Whereas these four types of properties are the defining characteristics of a lexical unit, other information may be associated with it, for example, its […], its frequency of usage, its semantic counterpart in other languages, or encyclopedic knowledge (thus, it is one thing to know the meaning of bread and a different thing to know various sorts of bread, how it is made, its price, its role in the history of mankind, etc.).

The lexical units of a lexicon are in many ways interrelated. They may share some phonological properties (for example, they may rhyme with each other), they may belong to the same inflectional paradigm, they may have the opposite meaning (‘antonyms,’ such as black and white), approximately the same meaning (‘synonyms,’ such as to begin and to start), or when complex in form they may follow the same construction pattern. Lexicological research is often oriented towards these interrelations, whereas lexicography tends to give more weight to the lexical unit in itself. In general, there is much more lexicographical than lexicological work (for a survey of the latter, see Schwarze and Wunderlich 1985); in fact, if there is any piece of […] for some language, it is probably an elementary bilingual […]. The depth of this work varies massively not only across languages, but also with respect to the particular lexical properties. Whereas the phonological, graphematic and morphosyntactic features of the lexicon in Latin, English, French, and some dozen other languages with a comparable research tradition are fairly well described, there is no theoretically and empirically satisfactory analysis of the semantics of the lexicon for any language whatsoever. There are three interrelated reasons for this. First, there is no well-defined descriptive language which would allow the researcher to represent the meaning of some lexical unit, be it simple or compound; the most common practice is still to paraphrase it by an expression of the same language. Second, there is no reliable and easily applicable method of determining the lexical meaning of some unit; the most common way is to look at a number of occurrences in ongoing text and to try to understand what it means. Third, the relation between a particular form and a particular meaning is hardly ever straightforward; this is strikingly illustrated by a look at what even a medium-sized English dictionary has to say about the meaning of, for example, on, sound, eye or (to) put up. As a rule, there is not just one lexical meaning, but a whole array of uses which are more or less related to each other. This is not merely a practical problem for the lexicographer; it also casts some doubt on the very notion of ‘lexical unit’ itself (see also Lexical Semantics).

3. Making Dictionaries

Lexicographers often consider their work to be more of an art or a craft than a science (see, e.g., Landau 1984, Svensén 1993). This does not preclude a solid scientific basis, but it reflects the fact that their concrete work depends largely on practical skills such as being ‘a good definer,’ on one hand; and that it is to a great extent determined by practical, often commercial, concerns, on the other. Dictionaries are made for users, and they are intended to serve specific purposes. Their compilation requires a number of practical decisions.

3.1 Which Lexical Units are Included?

Languages are neither well-defined nor uniform entities; they change over time, and they vary with factors such as place, social class, or area talked about. A great deal of this variation is lexical. It is not possible nor would it be reasonable to cover this wealth in a single dictionary. Large dictionaries contain up to 300,000 ‘entries’; since idiomatic expressions are usually listed under one of their components (such as

to kick the bucket under (to) kick), they contain many more lexical units, perhaps up to 1 million. But, even so, they are by no means exhaustive. The second edition of the Deutsches Wörterbuch (see Sect. 4), the largest dictionary of German, covers less than 25 per cent of the lexical units found in the sources, and these sources are quite restricted themselves.

3.2 Which Lexical Properties are Described?

Just as it is impossible to include all lexical units of a language in a dictionary, it is neither possible nor desirable to aim at a full description of those which are included. Since a dictionary is normally a printed book, the graphematic properties of the unit (its ‘spelling’) are automatically given. Among the other defining properties, meaning is traditionally considered to be most important. Samuel Johnson’s dictionary from 1755 (see Sect. 4) defines ‘dictionary’ as ‘A book containing the words of any language in alphabetical order, with explanations of their meaning.’ But Johnson also noted which syllable carries the main stress, and he gave some grammatical hints. In general, however, information on phonological properties was rare up to the end of the nineteenth century, and information on grammatical properties is usually still very restricted in nonspecialized dictionaries. But there are, of course, also dictionaries which specifically address these properties as well as some of the nondefining properties associated with a lexical entry, such as its origin ([…]) or, above all, its equivalent in other languages (‘bilingual dictionary’).

3.3 What is the Description Based Upon?

Usually, two types of sources are distinguished: ‘primary sources’ are samples of text in which the unit is used, ‘secondary sources’ refers to prior work of other lexicographers (and lexicologists). In fact, there is a third source, normally not mentioned in the theory of lexicography (sometimes called ‘meta-lexicography’): this is the lexicographer’s own knowledge of the language to be described, including his or her views on what is ‘good’ language. In practice, the bulk of a new dictionary is based on older dictionaries. This is always immoral and often illegal, if these are simply copied; but on the other hand, it would be stupid and arrogant to ignore the achievements of earlier lexicographers.

3.4 How is the Information Presented?

A dictionary consists of lexical entries arranged in some conventional order. Normally, an entry combines several lexical units under a single ‘head word’; thus, all lexical units which include the word put may be listed under this head word, forming a kind of nest with an often very complex microstructure. We are used to alphabetically ordered dictionaries; but there are other possibilities, for example, ordering by thematic groups or by first appearance in written documents. Languages without alphabetic writing require different principles; in Chinese, for example, entries are usually arranged by subcomponents of the entire character and by the number of strokes.

These four questions can be answered in very different ways, resulting in very different types of dictionaries (see the survey in Hausmann et al. 1991, pp. 968–1573).

4. History

The first lexicographic documents are lists of Sumerian words (up to 1400) with their Akkadian equivalents, written in cuneiform script on clay tablets about 4,700 years ago. The practice of compiling such word lists was continued throughout Antiquity and the Middle Ages; thus, the oldest document in German, the Abrogans (written around 765), is an inventory of some Latin words with explanations in German. Usually, these ‘[…]’ did not aim at a full account of the lexicon; they simply brought together a number of words which, for one reason or another, were felt to be ‘difficult,’ and explained them either by a more familiar word in the same language or by a […]. Words were ordered alphabetically, by theme, or not at all. But there are also more systematic attempts, such as the Catholicon, a mixture of […] and dictionary which, compiled around 1250, was the first printed lexical work in Europe (Mainz 1460).

In the sixteenth century, two developments led to major changes. The first of these was the invention of printing by Gutenberg. By 1500, virtually all classical authors were available in print, thus offering a solid basis for systematic lexical accounts of Latin and Greek, such as Calepinus’ Dictionarium (1502), soon to be followed by two early masterpieces: Robert Etienne’s Dictionarium seu Latinae Linguae Thesaurus (Paris 1531) and Henri Etienne’s Thesaurus Graecae Linguae (Paris 1572). The second major development was the slow but steady rise of national languages. Since early Italian, French, English, or German were hardly codified, a major aim of the first dictionaries in these languages was to give them clear norms. In some countries, national Academies were founded to this end. The results were dictionaries with a strongly normative, often puristic, stance, such as the Vocabolario degli Accademici della Crusca (Venice 1612), the Dictionnaire de l’Académie Française (Paris 1694) and the Diccionario de autoridades publicado por la Real Academia Española (1726–1739).

The bulk of lexicographic work, however, was always done by enterprising publishers and engaged

individuals, such as Dr Samuel Johnson. Helped by six assistants, he produced A Dictionary of the English Language (London 1755), the first scholarly description of the English […], in less than eight years. It surpassed all its predecessors, including Bailey’s Dictionarium Britannicum from 1736, which Johnson took as his point of departure, by the systematic use of quotations, taken from the best writers, and by his brilliant, sometimes somewhat extravagant, definitions (not everybody would dare to characterize patriotism as ‘the last refuge of a scoundrel’). Less known, much less witty, but broader in coverage is the first comprehensive dictionary of German, Johann Christoph Adelung’s Versuch eines vollständigen grammatisch-kritischen Wörterbuchs der hochdeutschen Mundart (Leipzig 1774–86).

The rise of historical-comparative linguistics in the early nineteenth century led to an enormous increase in grammatical and lexical knowledge. The first dictionary which tried to cover this knowledge was the Deutsches Wörterbuch by Jacob Grimm and (to a much lesser extent) his brother Wilhelm Grimm. Its first fascicle appeared in 1852, after about ten years of preparatory work, in which the Grimms were helped by about 100 scholars providing excerpts (‘covering my desk like snowflakes,’ Jacob Grimm). At that time, it was already clear that the original plan of 6–7 volumes, to be finished within 10–12 years, was unrealistic. The Grimms finished only letters A–(most of) F, and the final folio volume (of altogether 32) appeared in 1960. This long duration, as well as the varying talents and preferences of the contributors, has led to many inconsistencies; some entries got out of balance (no less than 60 pages are devoted to the single word Geist); still, it is an incommensurable source of lexical information.

The work of the Grimms inspired a number of similar ventures, such as Emile Littré’s masterly Dictionnaire de la langue française (1863–1873), which is much shorter, but also much more consistent; Matthias de Vries and his numerous successors’ voluminous Woordenboek der Nederlandsche Taal (1864–1998); and finally A New English Dictionary on a Historical Basis (1884–1928), generally referred to as the ‘Oxford English Dictionary’ (OED). It was initiated in 1857 by the philologist and churchman Richard Trench; in 1860, members of the Philological Society started to collect excerpts; in 1879, the Clarendon Press appointed James Murray as the Principal Editor. The first fascicle appeared in 1884, and the whole work was completed in 1928, 13 years after Murray’s death. More than 200 scholars were involved in its production, and more than 2,000 people are known to have contributed excerpts. The OED is not without flaws, even in its revised edition, which appeared in 1989 in print and in 1992 on CD-ROM; but among all attempts to describe the lexicon of a language, it comes closest to falsifying what Dr Johnson stated in the preface to his own dictionary: ‘Every other author may aspire to praise; the lexicographer can only hope to escape reproach.’ (For a comprehensive survey of lexicographic work across languages, see Hausmann et al. 1991, pp. 1679–2710, 2949–3119).

5. The Use of Computers

We tend to think of dictionaries as the normal, if not the only possible way to compile and to present lexical information. But the dawn of the computer has provided us with a very different and in many ways more efficient tool. Computers can be used in at least three ways in lexicography. It is possible to transfer an existing dictionary to a computer, as has often been done over the last 20 years. Such a transfer offers several advantages: search is faster and more exhaustive; it is easier to revise and update the dictionary; and it is possible to add information not available in book format, for example, spoken sound instead of phonetic transcriptions. But essentially, the format of the printed book is maintained. Next, computers are a powerful tool in the production of a new dictionary. Rather than having a number of people read through books and newspapers and make excerpts of all occurrences which look interesting, it is now possible to compile huge text corpora that cover all varieties of a language, to scan these texts for all occurrences of words or word combinations, to sort these occurrences by various criteria, to link them to other occurrences, to add as much context as needed, etc. (see Corpus Linguistics).

The OED is based on about 5 million excerpts, mostly handwritten on paper slips. A computer can easily process corpora of several hundred million words, i.e., several hundred million occurrences; new sources can rapidly be added. This allows a much broader and much more representative coverage of a lexicon than ever. But electronic corpora only provide the raw material; it still awaits analysis. This analysis can also be facilitated by computer tools; but no computer can tell us what a word means in a particular context. And even if only one minute is devoted to each occurrence in a one-hundred-million-word corpus, it would take 10 lexicographers 100 years to go through it. This means that printed dictionaries can never reflect the wealth of information accessible in large corpora, since they presuppose that the lexicographer has finished the analysis. Therefore, the only way to make full use of large corpora is by means of lexical retrieval systems. They consist of (a) a computer-accessible and expandable corpus, (b) a set of tools, which allow, for example, not only the search for certain items but also statistical analysis or the determination of the first occurrence, and (c) a selective but steadily proceeding lexical analysis of the corpus. Thus, it is possible to add spoken forms in various […], information about word classes, or

the semantic analysis of some subset of lexical units, say all prepositions or all morphologically simple verbs. Similarly, translation equivalents can be added. Unlike printed dictionaries, such a lexical retrieval system will never come to an end; it is steady work-in-progress to which many can contribute and which will give us a deeper and broader understanding of the lexicon than any other method.

See also: […]; […]; Lexical Access, Cognitive Psychology of; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Lexical Semantics

Bibliography

Hausmann F-J, Reichmann O, Wiegand H E, Zgusta L (eds.) 1991 Wörterbücher / Dictionaries / Dictionnaires. De Gruyter, Berlin
Landau S A 1984 Dictionaries: The Art and Craft of Lexicography. Scribner, New York
McArthur T 1986 Worlds of Reference. Lexicography, Learning and Language from the Clay Tablet to the Computer. Cambridge University Press, Cambridge, UK
Schwarze C, Wunderlich D (eds.) 1985 Handbuch der Lexikologie. Athenäum, Königstein, Germany
Svensén B 1993 Practical Lexicography. Principles and Methods of Dictionary Making. Oxford University Press, Oxford, UK

W. Klein

Lexicon

1. The Notion of Lexicon

The lexicon is standardly viewed as a listing of all the […] of a language, with information indicating how each behaves in the components of grammar involving phonology, syntax, and semantics. In no small part, the shape and character of grammar are determined by what the lexicon contains for these other grammatical devices. Nevertheless, both historically and conventionally, the lexicon has been seen as the passive module in the system of grammar.

More recently, the model of the lexicon has undergone significant revision and maturation. In particular, two trends have driven the architectural concerns of lexical researchers: (a) a tighter integration of compositional operations of syntax and semantics with the lexical information structures that bear them; and (b) a serious concern with how lexical types reflect the underlying ontological commitments of the grammar. In the process, the field has moved towards addressing more encompassing problems in linguistic theory, such as those below:
(a) How can we explain the polymorphic nature of language?
(b) How can we capture the creative use of words in novel contexts?
(c) How can semantic types predictably map to syntactic representations?
(d) What are the ‘atoms’ of lexical knowledge, if they exist at all?

In this article, we first review the conventional view of the lexicon and then contrast this with the theories of lexical information that have emerged since around 1990.

By all accounts, the conventional model of the lexicon is that of a database of words, ready to act in the service of more dynamic components of the grammar. This view has its origins squarely in the generative tradition (Chomsky 1955) and has been an increasingly integral part of the […] of the lexicon ever since. While the ‘Aspects’ model of selectional features restricted the relation of selection to that between lexical items, work by Jackendoff (1972) and McCawley (1968) showed that selectional restrictions must be available to computations at the level of derived semantic representation rather than at deep structure. But where did this view come from? In order to understand both the classical model of the lexicon as a database and the current models of lexically encoded grammatical information, it is necessary to appreciate the structuralist distinction between ‘syntagmatic processes’ and ‘paradigmatic systems’ in language. The lexicon has emerged as the focal point communicating between these two components, and can be seen as a hook which links the information at these two levels. One can go further still and view the elements of the lexicon as not just the building blocks for the more active components of the grammar, but also as actively engaging the building principles themselves.

While syntagmatic processes refer to the influence of horizontal elements on a word or phrase, paradigmatic systems refer to vertical substitutions in a phrasal structure. Syntagmatics evolved into the theory of abstract syntax while paradigmatics was all but abandoned in generative linguistics. In an early discussion of syntagmatic dependencies, Hjelmslev (1943) uses the term ‘selection’ explicitly in the modern sense and notes the importance of integrating paradigmatic systems with the syntagmatic processes they participate in. For Hjelmslev, there are two possible types of relations that can exist between elements in a syntagmatic process: ‘interdependence’ and ‘determination’, the latter of which is related to the notion of selectional restriction as developed by Chomsky (1965). As Cruse (1986) notes ‘One reason that selectional restrictions were not integrated into mechanisms of grammatical selection and description in the 1970s and 1980s is that, if they are imposed correctly, the grammar is forced to model two computations: (a) the entailment relations between selectional


Copyright © 2001 Elsevier Science Ltd. All rights reserved. International Encyclopedia of the Social & Behavioral Sciences. ISBN: 0-08-043076-7