Statistical Grammar Models and Lexicon Acquisition


S. Schulte im Walde, H. Schmid, M. Rooth, S. Riezler, D. Prescher

Chapter 12 of Linguistic Form and its Computation. Edited by Christian Rohrer, Antje Roßdeutscher and Hans Kamp. Copyright © CSLI Publications.

Introduction

This paper presents a framework for developing and training statistical grammar models for the acquisition of lexicon information. Utilising a robust parsing environment and mathematically well-defined unsupervised training methods, the framework enables us to induce lexicon information from text corpora. Particular strengths of the approach are (i) that no extensive manual work is required to set up the framework, and (ii) that the framework is applicable to any desired language. It has already been applied to English and German (Carroll and Rooth; Beil et al.; Rooth et al.; Schulte im Walde, a), Portuguese (de Lima), and Chinese (Hockenmaier).

Manual work within the framework is reduced to a minimum, since the necessary grammars need not go into detailed structures for the relevant grammar aspects to be trained sufficiently. The automatic training process utilises a shallow parser embedded in the mathematically well-defined Expectation-Maximisation algorithm. The training approach forces the lexicalised parameters in the statistical grammar towards linguistic reliability. The basic assumption is that the linguistically correct analyses of text correspond to those analyses which maximise the probability of the data.

The linguistic value of the grammar models mainly lies in the lexicalised model parameters. They contain lexicalised rules, i.e. grammar rules referring to a specific lexical head, and lexical choice parameters, a measure of lexical coherence between lexical heads. Concerning verbs, for example, the lexical rule parameters serve as a basis for probability distributions over subcategorisation frames, and the lexical choice parameters supply us with nominal heads of subcategorised noun phrases as a basis for selectional constraints. The information can be used directly as lexical description, or as input for lexicon tools such as semantic clustering techniques (Rooth et al.; Schulte im Walde, a), or as a basis for a variety of applications, e.g. parser improvement (Riezler et al.), chunking (Schmid and Schulte im Walde), or machine translation (Prescher et al.).

The reader might still wonder about the exact nature of the lexical information we gain. Consider this concrete example: our trained grammar model for German tells us that the verb essen 'eat' most probably occurs transitively, but may also occur intransitively. In addition, we learn that the most frequent nominal heads in the direct object slot of the transitive frame include the German equivalents of the nouns bread, meat, banana, and ice-cream.

The first part of this chapter concerns grammar development and training. Its first section gives practical insights into the prerequisites for our statistical grammars and describes a characteristic grammar development process by means of the German grammar. The following section introduces the theoretical background of statistical grammars and their head-lexicalised refinements, and describes their training facilities. A further section then presents the application of the training procedure to the German grammar example.

The second part of this chapter illustrates various possibilities for exploiting the lexicalised probability models. One section directly utilises the model parameters to extract lexical parameters, mainly for verbs, and to apply specific parsing facilities such as Viterbi parsing or noun chunking. Another section demonstrates the use of lexical information, with specific reference to lexical coherence between verbs and subcategorised nouns, as input for semantic clustering techniques.

Grammar Development

Our statistical grammar models can be developed for arbitrary languages, presupposing (i) a corpus as source for empirical input data, (ii) a morphological analyser for analysing the corpus word-forms and assigning lemmas where appropriate, and (iii) a context-free grammar (CFG) for parsing the corpus data. The grammar is supposed to cover a sufficient part of the corpus, since developing a statistical grammar model on the basis of the grammar (cf. the later sections on training) requires a large number of structural relations within parses. The more corpus data is accessible for grammar training, the more reliable the probability model will be.

As mentioned in the introduction, manual work concerning the grammar is reduced to a minimum. The necessary grammars need not go into detailed structures for the relevant grammar aspects to be trained sufficiently. The complete framework can be set up within a few weeks' time and can easily be transferred to a different language. This property is an advantage of the framework over, e.g., treebank grammars (Charniak), since it does not presuppose a treebank for the relevant language.

So far, we have worked on statistical grammar models for English (Carroll and Rooth), German (an earlier version is described in Beil et al.), Portuguese (de Lima), and Chinese (Hockenmaier).

The preparation of the relevant corpus data, the task definition of the morphological analyser, and a context-free grammar are described below. To illustrate the grammar development framework, we concentrate on the German model. We specifically describe the grammar development facilities and outline the grammar structure.

Corpus Preparation

We created two sub-corpora from the … million token newspaper corpus Huge German Corpus (HGC): (a) a sub-corpus containing verb-final clauses with a total of … million words, and (b) a sub-corpus containing … million relative clauses with a total of … million words. Apart from non-finite clauses as verbal arguments, there are no further clausal embeddings, and the clauses do not contain any punctuation except for a terminal period. The average clause length is … and … words per clause, respectively.

Morphological Analyser

We utilised a finite-state morphology (Schiller and Stöckert) to assign multiple morphological features such as part-of-speech tag, case, gender and number to the corpus words, partly collapsed to reduce the number of analyses. For example, the word Bleibe, which is either the case-ambiguous feminine singular noun 'residence' or a person- and mode-ambiguous finite singular present tense verb form of 'stay', is analysed as follows:

analyse Bleibe
  BleibeNNFemAkkSg
  BleibeNNFemDatSg
  BleibeNNFemGenSg
  BleibeNNFemNomSg
  bleibenVSgPresInd
  bleibenVSgPresKonj
  bleibenVSgPresKonj

Reducing the ambiguous categories leaves the two morphological analyses

Bleibe: NNFemCasSg, VVFIN

Apart from assigning morphological analyses, the tool also serves as a lemmatiser (cf. Schulze).

The German Context-Free Grammar

The context-free grammar contains rules with their heads marked. With very few exceptions (rules for coordination, S-rule), the rules do not have more than two daughters. The terminal categories in the grammar correspond to the collapsed corpus tags assigned by the morphology.

Grammar development is facilitated by (a) a grammar development environment of the feature-based grammar formalism YAP (Schmid), and (b) a chart browser that permits a quick and efficient discovery of grammar bugs (Carroll). The chart browser figure shows that the ambiguity in the chart is quite considerable, even though grammar and corpus are restricted.

[Figure: Chart Browser for Grammar Development]

The grammar covers … of the verb-final and … of the relative clauses, i.e. the respective parts of the corpora are assigned parses.

The following sections describe two essential parts of the grammar: the noun chunks and the definition of subcategorisation frames. For more details concerning the German grammar structure, see Schulte im Walde (b).

Noun Chunks

On nominal categories, in addition to the four cases Nom, Gen, Dat, and Akk, case features with a disjunctive interpretation (such as Dir for Nom or Akk) are used. The grammar is written in such a way that non-disjunctive features are introduced high up in the tree. The noun projection figures illustrate the use of disjunctive features for the German noun phrase eine gute Gelegenheit 'a good opportunity' in all four cases: the terminal NN contains the four-way ambiguous Cas case feature; the N-bar (NN) and noun chunk (NC) projections disambiguate to the two-way ambiguous case features Dir and Obl; the weak/strong (Sw/St) feature of NN allows or prevents combination with a determiner, respectively; only at the noun phrase (NP) projection level does the case feature appear in disambiguated form. The use of disjunctive case features results in some reduction in the size of the parse forest. Essentially the full range of agreement inside the noun phrase is enforced. Agreement between the subject NP and the tensed verb is not enforced by the grammar, in order to control the number of parameters and rules.

The noun chunk definition refers to Abney's chunk grammar organisation (Abney): the noun chunk (NC) is a projection that excludes post-head complements and (adverbial) adjuncts introduced higher than pre-head modifiers and determiners, but includes participial pre-modifiers with their complements.

[Figure: Noun Projection. Parse tree for eine gute Gelegenheit with the node labels NPNom, NCDir, ARTE, ARTIndefE, ADJE, NNFemDirSw, NNFemCasSg]
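The disjunctive case features used in the morphology and in the noun projections (Cas for all four cases, Dir for Nom or Akk) can be illustrated with a small sketch. It is not the authors' implementation: the function name collapse_case, the tuple representation of analyses, the dot-separated output tags, and the reading of Obl as Gen or Dat are assumptions made for illustration only.

```python
# Minimal sketch: merge analyses that differ only in case into one tag
# carrying a disjunctive case feature. All names here are hypothetical.

CASES = ("Nom", "Gen", "Dat", "Akk")

# "Cas" (all four cases) and "Dir" (Nom or Akk) follow the text;
# "Obl" as Gen-or-Dat is an assumption.
DISJUNCTIVE = {
    frozenset(CASES): "Cas",
    frozenset({"Nom", "Akk"}): "Dir",
    frozenset({"Gen", "Dat"}): "Obl",
}


def collapse_case(analyses):
    """Collapse case-ambiguous analyses, given as tuples of features."""
    groups = {}  # (features left of case, features right of case) -> cases
    for feats in analyses:
        slots = [i for i, f in enumerate(feats) if f in CASES]
        if len(slots) == 1:
            i = slots[0]
            groups.setdefault((feats[:i], feats[i + 1:]), []).append(feats[i])
        else:
            # No single case slot: keep the analysis unchanged.
            groups.setdefault((feats, None), [])
    tags = []
    for (left, right), cases in groups.items():
        if right is None:
            tags.append(".".join(left))
            continue
        label = DISJUNCTIVE.get(frozenset(cases))
        # Without a matching disjunctive label, keep one tag per case.
        labels = [label] if label else [c for c in CASES if c in cases]
        tags.extend(".".join(left + (lab,) + right) for lab in labels)
    return tags


# The four noun readings of "Bleibe" from the morphology example.
bleibe = [
    ("NN", "Fem", "Akk", "Sg"),
    ("NN", "Fem", "Dat", "Sg"),
    ("NN", "Fem", "Gen", "Sg"),
    ("NN", "Fem", "Nom", "Sg"),
]
print(collapse_case(bleibe))  # ['NN.Fem.Cas.Sg']
```

Grouping analyses by everything except the case slot mirrors the idea that a terminal carries one ambiguous feature rather than four separate analyses, which is what keeps the parse forest smaller.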