Computational Morphology

COMPUTATIONAL MORPHOLOGY Radhika Mamidi Winter workshop/school on Machine Learning and Text Analytics 17 December, 2013 Outline What is Computational Morphology? Morphology concepts Computational approaches Morphological Analysis Morphological Generator Computational Approaches Paradigm based Finite State based Statistical based Issues to be handled Computational Morphology Linguistics: Scientific study of language at these levels Sound – Phonetics and Phonology Word - Morphology Sentence - Syntax Discourse – Discourse Analysis Computational Linguistics: Processing and producing language computationally. Computational Morphology: Processing and producing language computationally at WORD LEVEL Why are we studying morphology? The knowledge of words will help us process language computationally at word level. Knowledge of words include word structure and word formation rules. This knowledge will help us in developing NLP tools like “Morphological Analyzers” and “Morphological Generators” These are important in Information Retrieval, Machine Translation, Spell checking, etc. What’s Morphology? The study of word structure The study of the mental dictionary: How are words stored in the mind? What is a possible word? Example: (i) At the supermarket, the girls bought pink cheeriots and the boys blue fistings. (ii) When their mother signaled, the girls barried home unhappily. The words ‘cheeriots’, ‘fistings’ and ‘barried’ do not exist. However, assuming they are valid words of English, we ‘guess’ the meaning by context and the position of the word in the given sentence. We do this using our general knowledge and linguistic knowledge. At the supermarket, the girls bought pink cheeriots and the boys blue fistings. Part of speech = nouns [comes after adjectives] [-s ending] = more than one Meaning = some objects that have color [clue: supermarket] = some object that is sold; perhaps a toy The word forms are more like toys, balls, ribbons. When their mother signaled, the girls barried home unhappily. Part of Speech = verb [position] [ends in –ed] = past tense Base form = barry Meaning = go The word form is more like carried, married. What’s the Longest Word of English? Could it be ismestablishmentariandisanti ? Why not when antidisestablishmentarianism is possible. Possible words with anti and missile: anti-missile (adjective) anti-missile missile: a missile used for anti-missile purposes anti- anti-missile missile missile: a missile used against anti- missile missiles antiNmissileN+1, where N can go until…. There is a systematic way of word formation. Eg: happy, -ness, un- This is called Morphotactics. Morphemes Have a sound [form] and a meaning: Example: “cats” /kaet/ “four-legged animal” /-s/ “plural number” Even though /-s/ has a sound and a meaning, it can’t mean “plural” by itself… It has to attach to a noun. “A morpheme is the smallest unit of word form that has meaning” Examples: cats = cat + -s girlish = girl + -ish unfriendly = un- + friend + -ly cat, -s, girl, -ish, un-, friend, -ly are morphemes Even Bush knows morphology (…though he may use it differently than the rest of us) The war on terrorism has transformationed the US-Russia Relationship We’re working to help Russia securitize the dismantled warheads The explorationists are only willing to help move equipment during the winter This case has had full analyzation and has been looked at a lot Compositionality “Explorationists” explore: to spend an extended effort looking around a particular area - X -ation: can attach to Verbs, the process of Xing - Y -ist: can attach to Nouns, one who performs an action Y - Z -s: attaches to Nouns, more than one Z explorationists: a compositional word Fully compositional meaning is based on its parts Eg: computer, impossible, headache Non-compositionality “Inflect” Is inflect morphologically complex? It contains more than one morpheme. What do in- and flect mean? This is a case of a non-compositional meaning. In explorationists, if you know the meaning of the parts, you know the meaning of the whole. Not necessarily so for inflect. Non-compositional meaning cannot be derived from its parts. Lexical/Content words Words which are not function words are called content words or lexical words: these include nouns, verbs, adjectives, and most adverbs, though some adverbs are function words (e.g. then, why). They belong to open class. Dictionaries define the specific meanings of content words, but can only describe the general usages of function words. By contrast, grammars describe the use of function words in detail, but have little interest in lexical words. Function words Function words or grammatical words are words that have little lexical meaning or have ambiguous meaning, but instead serve to express grammatical relationships with other words within a sentence. Function words may be prepositions, pronouns, auxiliary verbs, conjunctions, grammatical articles or particles, all of which belong to the group of closed class words. To know about morpheme we should know about…. Free morphemes vs. Bound morphemes Lexical morphemes vs. Functional morphemes Null/Zero morpheme Inflectional morphemes vs. Derivational morphemes Root morphemes vs. Affix morphemes Free vs Bound morphemes electr- and tox- have isolable meanings in electric, electrify, toxic, (de-)toxify But they cannot be pronounced on their own: they are bound morphemes girl and book have isolable meanings in girls, girlish, books, booked, booking They can occur on their own: they are free morphemes Are prefixes and suffixes bound morphemes? Lexical morpheme free morphemes: apple, smart, book, slow, eat, write They can exist on their own as independent words. bound morphemes: -ceive, -ject, cran-, -ship, un-, dis- They cannot be used independently. They need another morpheme [free or bound] to form a word. Eg: re-ceive, con-ceive, sub-ject, pro-ject, cran-berry, scholar-ship, fellow-ship, un-kind, dis-obey Functional morphemes free morphemes: of, with, she, it, and, although, however, because, then bound morphemes: -s, -ed, -ing, Four-way contrasts Lexical, Free: Nouns, Verbs, Adj, Adv cat, town, call, house, hall, smart, fast Lexical, Bound: including derivational affixes rasp- [raspberry], cran- [cranberry] , -ceive [conceive, receive], un-, re-, pre- Functional, Free: Prepositions, Articles, Conj with, at, and, an, the, because Functional, Bound: inflectional affixes -s, -ed, -ing [eats, walked, laughing] Exercise 1: Identify the free and bound morphemes in the following words walked, talked, danced, arrived playhouse, watchdog, football player drinking, playing, eating import, export, transport raspberry, cranberry invert, convert, divert books, pens, boards writer, caretaker, rider, fighter Can the following words be decomposed? delight, news, traitor, bed, evening Exercise 2: Identify the lexical and functional morphemes in the following words. Mention if they are free or bound. politically beautiful between writing raspberries unable nationalization Inflection vs. Derivation Derivational affixes: allow us to make new words that alters the meaning. There is an error in the computation. [computeV – computationN] {Nominalisation} It is a computational approach. [computationN – computationalAdj] {Adjectivization} Inflectional suffixes: required in order to make the sentence grammatical Inflected words belong to the same class *Yesterday I walk to class [walkV – walkedV] *I like all my student [studentN – studentsN] Inflectional Morphology Examples: [the POS remains the same] VERBS EAT = eat, eats, ate, eaten, eating DRINK = drink, drinks, drank, drunk, drinking PLAY = play, plays, played, played, playing 0, -s, -ed, -en, -ing are inflectional morphemes NOUNS PLAY = play, plays GIRL = girl, girls SHEEP = sheep, sheep -s, 0 are inflectional morphemes Derivational morphology Two types: May change the category {N,V,A,Adv} driveV +er = driverN eatv + able = eatableadj girlN + ish = girlishadj disturb V + ance = disturbance N Doesn’t have to change the category un + doV = undoV re+fryv = refryv un+happyAdj = unhappyAdj Derivational – more examples Verbs eat – eatable [adj], eatables [noun] drink – drinking [noun] play – player [noun] -able, -ing, -er are derivational morphemes Nouns play – playful [adj], replay [verb] girl – girlish [adj], girlhood [noun] sheep – sheepish[adj] -ful, re-, -ish, -hood are derivational morphemes Exercise 3 Each of the words below contains two morphemes – a root and a derivational affix. Decide if the derivational affix changes only the meaning or the class of the root as well. rewrite hopeless happily national unclear creation happiness unhappy helpful undo Null/Zero morpheme A null morpheme is a morpheme that is realized by a phonologically null affix (an empty string of phonological segments) A null morpheme is an "invisible" affix. It's also called zero morpheme; the process of adding a null morpheme is called null affixation. Examples cat = cat + -0 = ROOT("cat") + SINGULAR cats = cat + -s = ROOT("cat") + PLURAL sheep = sheep + -0 = ROOT(“sheep") + SINGULAR sheep = sheep + -0 = ROOT(“sheep") + PLURAL More examples darken[verb] = dark [adj] + -en Meaning = make more ‘Adjective’ redden [verb] = red + -en [make more Red] yellow [verb] = yellow + 0 [make more yellow] brown [verb] = brown + 0 [make more brown] blacken [verb] = black + -en [make more black] Root Morphemes vs Affix morphemes Root morphemes are morphemes around which larger words are

Computational Morphology

Representation of Inflected Nouns in the Internal Lexicon

Reduplication in Distributed Morphology

University of California Santa Cruz Minimal Reduplication

Morphological Sources of Phonological Length

Morphological Well-Formedness As a Derivational Constraint on Syntax: Verbal Inflection in English1

JOSAR, Vol. 2 No. 1 March, 2017; P-ISSN: 2502-8251

Lexical Level

On Zero Agreement and Polysynthesis* Mark C. Baker Rutgers University

MORPHOLOGY What Is Morphology?

Inferring Morphological Rules from Small Examples Using 0/1 Linear Programming

On Null Causativization the Problem: Much Recent Literature (Pesetsky 1995, Harley 1995, Pylkkänen 2002, Alexiadou Et Al

Getting Rid of Number Features