NetWordS 2015

Word Knowledge and Word Usage: Representations and Processes in the Mental Lexicon

March 30th - April 1st, 2015
Scuola Normale Superiore, Pisa - Italy
CONFERENCE PROCEEDINGS
http://www.networds-esf.eu/


Vito Pirrelli, Claudia Marzi, and Marcello Ferro (eds.)


Acknowledgements

The international conference “Word Knowledge and Word Usage: Representations and processes in the mental lexicon” was supported by the European Science Foundation Standing Committee for the Humanities within the framework of the NetWordS ERP programme (May 2011 - April 2015).

Copyright © 2015 for the individual papers by the papers’ authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

Editors’ address: Consiglio Nazionale delle Ricerche, Istituto di Linguistica Computazionale, via G. Moruzzi, 1, 56124 Pisa, Italy

{vito.pirrelli, claudia.marzi, marcello.ferro}@ilc.cnr.it

The role of grammar factors and visual context in Norwegian children’s pronoun resolution

Camilla Hellum Foyn, Department of language and literature, NTNU, camilla.foyn@ntnu.no
Mila Vulchanova, Department of language and literature, NTNU, mila.vulchanova@ntnu.no
Rik Eshuis, Department of language and literature, NTNU, hendrik.eshuis@ntnu.no

1 Introduction

Most personal pronouns have one entry in the mental lexicon, but they can have different referents depending on the context they appear in. They are sometimes fairly ambiguous. There is also evidence that pronoun resolution is impaired in many developmental deficits. Children have to learn how to find the intended referent, but we do not know much about how resolution strategies are acquired. How do visual context and syntactic context influence children’s pronoun processing? Using eye-tracking, we investigate for the first time the development of Norwegian children’s pronoun resolution competencies in their L1.

2 The study

The participants were monolingual 3-, 5-, and 7-year-old children, as well as a control group of monolingual adults. There were between 25 and 28 participants in each group. In the first of three experiments, they listened to it-cleft sentences with either subject focus (2a) or object focus (2b), while they watched illustrations of two animals (corresponding to the subject and the object) on a screen. It-clefts provide a good environment for testing syntactically expressed focus, and appear to be more frequent in Norwegian than in, e.g., English (Gundel, 2002). The animals were sometimes shown performing the actions from the cleft-sentences, and other times not (see Table 1 for overview of conditions). Thereafter, the participants heard an ambiguous pronoun sentence (3), and eye-tracking data were collected to determine whether they looked at the subject or object referent. In addition, offline data were collected by asking the participants to name or point at the pronoun referent (4).

Example of the stimulus sentences:

1. Introduction sentence:
   Der er hesten og kaninen
   There are the.horse and the.rabbit

2a. Subject-cleft:
   Det er hesten som kiler kaninen
   It is the.horse that tickles the.rabbit

2b. Object-cleft:
   Det er kaninen hesten kiler
   It is the.rabbit the.horse tickles

3. Ambiguous pronoun sentence:
   Han kan telle til ti
   He can count to ten

4. Question sentence:
   Hvem kan telle til ti?
   Who can count to ten?

Conditions
1   Subject-cleft   Depicted action
2   Subject-cleft   No depicted action
3   Object-cleft    Depicted action
4   Object-cleft    No depicted action

Table 1: Conditions.

3 Earlier findings

According to Järvikivi et al. (2013), German 4-year-olds and adults show a subject preference regardless of which word the it-cleft focuses on. Moreover, children seem to show a weaker subject preference than adults. We expect similar results from our data.

Hartshorne et al. (2014) discovered that 2- to 3-year-olds have a first-mention preference that is seldom detected because they take longer to process. We thus expect young children to show a preference for subject and/or first-mentioned character, albeit at a later time window, whereas adults will show an earlier preference than children.

Bittner and Kuehnast (2011) have found that German 3-year-olds rely more on context-cues than older German children, who more often use syntax-cues. We thus expect that young children will be more influenced by the presence of visual context, whereas older children will be more sensitive to syntactically expressed focus.

4 Results

A mixed design ANOVA showed that 5-year-olds looked more at the subject referent after subject-clefts than object-clefts from 500-1000 ms after pronoun onset (p > .05), whereas adults did the same during the first 500 ms (p = .06). Adults also showed a general subject preference both offline (p > .001) and online (p > .05), specifically after subject-clefts as opposed to object-clefts offline (p > .05). Moreover, first-look data (first look at subject or object referent after pronoun onset) revealed a stronger subject preference in 7-year-olds after subject-clefts than object-clefts (p > .05). We found no significant effect of visual context in the children. However, an interaction effect in adults showed that their stronger subject preference in subject-clefts than object-clefts offline was only present when the action was not depicted (p > .05).

5 Conclusions

The results from the time series data suggest that adults process the pronouns faster than children, which supports Hartshorne et al. (2014).

In contrast to the older children, the 3-year-olds performed at chance level in all the different conditions. This may be due to what Hartshorne et al. (2014) found, namely that young children show a first-mention bias that is too slow to detect, or it may simply show that 3-year-olds are too young to comprehend cleft-sentences. In any case, this shows that older children have a stronger preference for the focused referent than younger children do.

Adults showed an overall subject preference regardless of sentence type, except in the condition with object-cleft and no depicted action. This appears to be the only condition that weakens their subject preference, probably because it leaves the subject without syntactic focus and with no visual support. Thus, the effect of syntactic focus and/or a first-mention preference emerges here.

Moreover, depicted action seems to have distracted the adults, since the effect of subject vs. object-clefts offline was only found when the action was not depicted.

In subject-clefts as opposed to object-clefts, 5- and 7-year-olds displayed an online subject preference, although in different manners. Adults also showed this preference, both offline and online. Hence, all these three age groups appear to use syntax cues, but adults seem to be more aware of them, as 5- and 7-year-olds still only reveal their preferences through their gaze behavior. This supports Järvikivi et al.’s (2013) suggestion that children use the same cues as adults, but that they have not fully developed their ability to do so.

References

Dagmar Bittner and Milena Kuehnast. 2011. Comprehension of intersentential pronouns in child German and child Bulgarian. First Language, 32(1-2), 176–204.

Jeanette K. Gundel. 2002. Information structure and the use of cleft sentences in English and Norwegian. Language and Computers, 39(1), 113–128.

Joshua K. Hartshorne, Rebecca Nappa, & Jesse Snedeker. 2014. Development of the first-mention bias. Journal of Child Language, 41(3), 1-24.

Juhani Järvikivi, Pirita Pyykkönen-Klauck, Sarah Schimke, Saveria Colonna, & Barbara Hemforth. 2013. Information structure cues for 4-year-olds and adults: tracking eye movements to visually presented anaphoric referents. Language and Cognitive Processes, 0(0), 1–16.

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

Table of contents

Foreword ...... 1

Invited talks ...... 2

Wolfgang U. Dressler
Psycholinguistic illusions in and on morphology ...... 2

Gabriella Vigliocco
The bridge of iconicity: from a world of experience to experience of language ...... 2

Michael Zock
Needles in a haystack and how to find them. Can neuroscientists, psychologists and computational linguist help us (to build a tool) to overcome the “tip of the tongue” problem? ...... 2

Marta Kutas
Content and organization of knowledge and its use in language comprehension ...... 3

Extended abstracts ...... 4

Olivier Bonami and Sacha Beniamine
Implicative structure and joint predictiveness ...... 4

Emmanuel Keuleers, Paweł Mandera, Michaël Stevens, and Marc Brysbaert
Of crowds and corpora: a marriage of measures ...... 10

Reza Falahati and Chiara Bertini
Perception of gesturally distinct consonants in Persian ...... 13

Hélène Giraudo and Madeleine Voga
Words matter more than morphemes: evidence from masked priming with bound-stem stimuli ...... 19

Giulia Bracco, Basilio Calderone, and Chiara Celata
Phonotactic probabilities in Italian simplex and complex words: a fragment priming study ...... 24

Jim Blevins, Petar Milin, and Michael Ramscar
Zipfian discrimination ...... 29

Gero Kunter
Effects of processing complexity in perception and production. The case of English comparative alternation ...... 32

Claudia Marzi, Marcello Ferro, and Vito Pirrelli
Lexical emergentism and the “frequency-by-regularity” interaction ...... 37

Sebastian Padó, Britta D. Zeller, and Jan Šnajder
Morphological priming in German: the word is not enough (or is it?) ...... 42

François Morlane-Hondère
What can distributional semantic models tell us about part-of relations? ...... 46

Ting Zhao and Victoria A. Murphy
Modeling lexical effects in language production: where have we gone wrong? ...... 51

Jens Fleischhauer
Activating attributes in frames ...... 58

Melanie J. Bell and Martin Schäfer
Modelling semantic transparency in English compound nouns ...... 63

Haim Dubossarsky, Yulia Tsvetkov, Chris Dyer, and Eitan Grossman
A bottom up approach to category mapping and meaning change ...... 66

Maria Rosenberg and Ingmarie Mellenius
What NN compounding in child language tells us about categorization ...... 71

Fabio Montermini
Using distributional data to explore derivational under-markedness: a study of the event/property polysemy in nominalization ...... 76

Dimitrios Alikaniotis and John N. Williams
A distributional semantics approach to implicit language learning ...... 81

Anna Anastassiadis-Symeonidis
Suffixation and the expression of space and time in modern Greek ...... 85

Alessandra Zarcone, Sebastian Padó, and Alessandro Lenci
Same same but different: type and typicality in a distributional model of complement coercion ...... 91

Jukka Hyönä, Minna Koski, and Alexander Pollatsek
Identifying existing and novel compound words in reading Finnish: an eye movement study ...... 95

Paolo Canal, Francesca Pesciarelli, Francesco Vespignani, Nicola Molinaro, and Cristina Cacciari
Electrophysiological correlates idioms comprehension: semantic composition does not follow lexical retrieval ...... 98

Sobh Chahboun, Valentin Vulchanov, David Saldaña, Hendrik Eshuis, and Mila Vulchanova
Metaphorical priming in a lexical decision task in high functioning . . . . . 102

Barbara Leone Fernandez, Manuel Perea, and Marta Vergara-Martínez
ERP correlates of letter-case in visual word recognition ...... 106

Pier Marco Bertinetto, Chiara Celata, and Luigi Talamo
Morphotactic effects on the processing of Italian derivatives ...... 109

Tatiana Iakovleva, Anna Piasecki, and Ton Dijkstra
Are you reading what I am reading? The impact of contrasting alphabetic scripts on reading English ...... 112

Dániel Czégel, Zsolt Lengyel, and Csaba Pléh
A study of relations between associative structure and morphological structure of Hungarian words ...... 117

Hélène Giraudo and Serena Dal Maso
Suffix perceptual salience in morphological processing: evidence from Italian ...... 120

II 187 [Peeters et al.2013] David Peeters, Ton Dijkstra, and Nana Huang Jonathan Grainger. 2013. The representation and A user-based approach to Spanish-speaking L2 acquisition of Chinese applicative processing of identical cognates by late bilinguals: operation ...... 124 RT and ERP effects. Journal of Memory and Lan- guage, 68(4):315–332. Hel´ ene` Giraudo and Karla Orihuela [Pérez et al.2010] Anita Méndez Pérez, Elizabeth D. Visual word recognition of morphologically complex words: effects of prime word Peña, and Lisa M. Bedore. 2010. Cognates fa- and root frequency ...... 128 cilitate word recognition in young Spanish-English bilinguals’ test performance. Early childhood ser- Jana Hasenacker,¨ Elisabeth Beyersmann, and Sascha Schroeder vices, 4(1):55. Language proficiency moderates morphological priming in children and adults . . . 132 [R Core Team2014] R Core Team, 2014. R: A Lan- guage and Environment for Statistical Computing. Natalia Slioussar and Anastasia Chuprina R Foundation for Statistical Computing, Vienna, Grouping morphologically complex words in the mental lexicon: evidence from Rus- Austria. sian verbs and nouns ...... 136

[Radanovic ´ et al.2014] Jelena Radanovic,´ Laurie Beth Radovan Garab´ık and Radoslav Br´ıda Feldman, and Petar Milin. 2014. Cognates and fre- Extraction and analysis of proper nouns in Slovak texts ...... 140 quency in Serbian and English. In The 3rd Rijeka Days of Experimental Psychology – REPSI, page 37, Rijeka, Croatia. Alessandro Lenci, Gianluca E. Lebani, Marco S. G. Senaldi, Sara Castagnoli, Francesca Masini, and Malvina Nissim [Rosselli et al.2012] Mónica Rosselli, Alfredo Ardila, Mapping the constructicon with SYMPAThy. Italian word combinations between María Beatriz Jurado, and Judy Lee Salvatierra. fixedness and productivity ...... 144 2012. Cognate facilitation effect in balanced and non-balanced spanish–english bilinguals using the Debela Tesfaye and Carita Paradis boston naming test. International Journal of Bilin- On the use of antonyms and synonyms from a domain perspective gualism , 18(6):649–662...... 150

[Strijkers et al.2009] Kristof Strijkers, Albert Costa, Rosario Caballero and Iraide Ibarretxe-Antunano˜ and Guillaume Thierry. 2009. Tracking lexical ac- From physical to metaphorical motion: a cross-genre approach ...... 155 cess in speech production: electrophysiological cor- relates of word frequency and cognate effects. Cere- Ida Raffaelli and Barbara Kerovec bral Cortex, 20:912–928. ’Taste’ and its conceptual extensions: the example of Croatian root kus/kusˇ and [Van Hell and De Groot2008] Janet G Van Hell and An- Turkish root tat ...... 158 nette MB De Groot. 2008. Sentence context modu- lates visual word recognition and translation in bilin- Javier E. D´ıaz-Vera guals. Acta psychologica, 128(3):431–451. Love in the time of the corpora. Preferential conceptualizations of love in world Englishes ...... 161 [Van Hell and Dijkstra2002] Janet G. Van Hell and Ton Dijkstra. 2002. Foreign language knowledge can in- fluence native language performance in exclusively Cristina Cacciari, Francesca Pesciarelli, Tania Gamberoni, and Fabio Ferlazzo native contexts. Psychonomic Bulletin & Review, Is black always the opposite of white? The comprehension of antonyms in schizophre- 9(4):780–789. nia and in healthy participants ...... 166

Simon De Deyne and Steven Verheyen Using network clustering to uncover the taxonomic and thematic structure of the mental lexicon ...... 172

Michael Richter and Jurgen¨ Hermes Classification of German verbs using nouns in argument positions and aspectual features ...... 177

Maja Andel, Jelena Radanovic,´ Laurie Beth Feldman, and Petar Milin Processing of cognates in Croatian as L1 and German as L2 ...... 182

Camilla Hellum Foyn, Mila Vulchanova, and Rik Eshuis The role of grammar factors and visual context in Norwegian children’s pronoun resolution ...... 187

Acknowledgements ...... 189

186 III biguous cues, such as cognate words, competition 2010. How cross-language similarity and task de- between cues does not emerge and the latter mands affect cognate recognition. Journal of Mem- learned relationships will show some preference. ory and language, 62(3):284–301. Previous research on highlighting indicates that [Duyck2005] Wouter Duyck. 2005. Translation and this pattern might be even more pronounced associative priming with cross-lingual pseudoho- when the cues are verbally (i.e., linguistically) mophones: evidence for nonselective phonologi- cal activation in bilinguals. Journal of Experimen- encoded (Kruschke et al., 2005; Kruschke, 2009). tal Psychology: Learning, Memory, and Cognition, This is what present results confirm as well. 31(6):1340. [Kamin1969] Leon J Kamin. 1969. Predictability, sur- prise, attention, and conditioning. In B Campbell Acknowledgments and R Church, editors, Punishment and aversive be- haviour, pages 279–296. Appleton-Century-Crofts. The research for this paper was financially sup- ported by the Short visit grant 4784 received by [Kelley and Kohnert2012] Alaina Kelley and Kathryn NetWordS-09-RNP-089, as well as by the Min- Kohnert. 2012. Is there a cognate advantage istry of Education, Science and Technological for typically developing Spanish-speaking English- language learners? Language, speech, and hearing Development of the Republic of Serbia grants services in schools, 43(2):191–204. ON179006 and ON179033. Furthermore, we wish to thank the Ministry of Science, Education and [Kroll et al.2002] Judith F Kroll, Erica Michael, Natasha Tokowicz, and Robert Dufour. 2002. The Sports of the Republic of Croatia for supporting development of lexical fluency in a second language. this research within the framework of the project Second language research, 18(2):137–171. 
130-1300869-0826 “Croatian and German in Con- [Kruschke and Hullinger2010] John K Kruschke and tact – sociocultural aspects and paradigms of com- Richard A Hullinger. 2010. Evolution of attention munication”. in learning. In Nestor Schmajuk, editor, Computa- tional models of conditioning, pages 10–52. Cam- bridge University Press, Cambridge, UK. References [Kruschke et al.2005] John K Kruschke, Emily S Kap- [Arnon and Ramscar2012] Inbal Arnon and Michael penman, and William P Hetrick. 2005. Eye gaze Ramscar. 2012. Granularity and the acquisition of and individual differences consistent with learned grammatical gender: How order-of-acquisition af- attention in associative blocking and highlighting. fects what gets learned. Cognition, 122(3):292–305. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(5):830–845. [Baayen and Milin2010] R.H. Baayen and P. Milin. 2010. Analyzing reaction times. Journal of Psy- [Kruschke2009] John K Kruschke. 2009. Highlight- chological Research, 3(2):12–28. ing: A canonical experiment. Psychology of Learn- ing and Motivation, 51:153–185. [Bates et al.2014] Douglas Bates, Martin Mächler, Ben [Kuznetsova et al.2014] Alexandra Kuznetsova, Per Bolker, and Steve Walker. 2014. Fitting linear Bruun Brockhoff, and Rune Haubo Bojesen Chris- mixed-effects models using lme4. arXiv preprint tensen, 2014. lmerTest: Tests in Linear Mixed arXiv:1406.5823. Effects Models. R package version 2.0-20.

[Caramazza and Brones1979] Alfonso Caramazza and [Lemhöfer and Dijkstra2004] Kristin Lemhöfer and Isabel Brones. 1979. Lexical access in bilinguals. Ton Dijkstra. 2004. Recognizing cognates and in- Bulletin of the Psychonomic Society, 13(4):212–214. terlingual homographs: Effects of code similarity in language-specific and generalized lexical decision. [De Groot and Nas1991] Annette MB De Groot and Memory & Cognition, 32(4):533–550. Gerard LJ Nas. 1991. Lexical representation of cognates and noncognates in compound bilinguals. [Midgley et al.2011] Katherine J. Midgley, Phillip J. Journal of memory and language, 30(1):90–123. Holcomb, and Jonathan Grainger. 2011. Effects of cognate status on word comprehension in second [Dijkstra et al.1999] Ton Dijkstra, Jonathan Grainger, language learners: An ERP investigation. Journal and Walter JB Van Heuven. 1999. Recognition of Cognitive Neuroscience, 23(7):1634–1647. of cognates and interlingual homographs: The ne- glected role of phonology. Journal of Memory and [Mulder et al.2011] Kimberley Mulder, M Dijkstra, and language, 41(4):496–518. T Schreuder. 2011. Are eye-fixations in cognate processing dependent on entropy? In The 17th [Dijkstra et al.2010] Ton Dijkstra, Koji Miwa, Bianca Meeting of the European Society for Cognitive Psy- Brummelhuis, Maya Sappelli, and Harald Baayen. chology [ESCOP 2011].

IV 185 ence and the constellation of cues available in the Foreword learning environment. In particular, knowledge in L1 as well as learning history will determine the This international conference “Word Knowledge and Word Usage: Representations and pro- degree and style of interference that we encounter cesses in the mental lexicon” is the final outcome of 4 years of intense multi-disciplinary when learning an L2. This kind of blocking effect research networking and cooperation funded by the European Science Foundation within the framework of the NetWordS programme (May 2011 - April 2015). is well documented in learning theory (Kamin,

1969). NetWordS’ mission was to bring together experts of various research fields (from brain sci- A blocking effect describes failures of learn- ences and computing to cognition and linguistics) and of different theoretical inclinations, ing that arise when a target cue is presented with to advance the current awareness of theoretical, typological, psycholinguistic, computational another cue whose informativity with respect to and neurophysiological evidence on the structure and processing of words, with a view to de- an outcome has already been established. Arnon veloping novel research paradigms and bringing up a new generation of language scholars. The conference was intended to provide a first forum for assessing current progress of cross- and Ramscar (2012) demonstrated in great detail disciplinary research on language architecture and usage, and discussing prospects of future how blocking may influence L2 acquisition when synergy. cues from the two languages are competing for the same outcome (a symbolic lexical representation). People are known to memorise, parse and access words in a context-sensitive and opportunis- Cue blocking does not apply directly to cog- tic way, by caching their most habitual and productive processing patterns into routinized nates, however, because typically, cues are iden- behavioural schemes. Speakers not only take advantage of token-based information such as frequency of individual, holistically stored words, but they are also able to organise stored tical and, thus, cannot compete and/or block each words through paradigmatic structures (or word families) whose overall size and frequency other. Nonetheless, Arnon and Ramscar’s general is an important determinant of ease of lexical access and interpretation. 
Accordingly, lexical observation regarding the way in which learning organisation is not necessarily functional to descriptive economy and minimisation of storage, is structured helps to make sense of the present but to more performance-oriented factors such as efficiency of memorisation, access and recall. findings. All that is needed is to extend it to the Usage-based approaches to word processing lend support to this view, to promote explana- distinctive properties of cognates whereby learn- tory frameworks that aim to investigate the stable correlation patterns linking distributional ing entails mapping the very same cues (cognate entrenchment of lexical units with productivity, internal structure and ease of interpretation. Ultimately, this is intended to establish a deep interconnection between performance-oriented, word forms) onto the same outcome. low-level lexical functions such as memorisation, rehearsal, access and recall, and their neu- Further insights derive from the highlighting roanatomical correlates. effect (Kruschke, 2009) on the target cues. First, the theory predicts that contextual (ambient) cues The impressive wealth of data and approaches reported in 23 oral presentations and 19 posters are informative about the learning cues, but not (selected from 84 original submissions), and the broader perspectives broached by Wolfgang about outcomes (Kruschke and Hullinger, 2010). U. Dressler, Marta Kutas, Gabriella Vigliocco and Michael Zock, provided compelling evidence that the time has now come for this area to make a significant methodological leap towards Therefore, temporal and/or contingency aspects tighter and targeted synergy. The overall conference message was clear. Interdisciplinarity of the situation are useful for discriminating be- should be coupled with both theoretical modelling and quantitative analysis of empirical ev- tween specific contexts of learning. Second, learn- Figure 1: Three-way interaction language by cog- idence. 
Any truly interdisciplinary effort must take advantage of the many methodological ing cues can be unambiguous or ambiguous for nates by frequency to reaction time latencies in vi- caveats that psycholinguists, neurolinguists, theoretical and cognitive linguists, historical lin- a particular outcome, and the highlighting effect guists, typologists and computational linguists have developed over many years of relatively sual lexical decision task. predicts that early ambiguous and late unambigu- independent work. Integration of their data and approaches will necessary mean more com- ous cues are more informative (Kruschke, 2009). plex models, far more constrained, explanatory and comprehensive than any other account put forward so far. There is general consensus that joining forces in this research area will Thus, the availability of either L1 or L2 (but not to another sample of participants (studying L2 as not only lead to considerable progress in our theoretical understanding of the physiology of their major) and another L1/L2 combination. The both) provides a context for a given cognate cue communication, but will also be conducive to more effective ways to help real people engaged fact that target frequency played an important role (actively present in the sensory input). Given high- in their daily communicative exchanges. seems more compatible with an account based on lighting mechanism, with cognate forms are un- proficiency. ambiguous cues we expect facilitation for a lat- The conference was held in Pisa at the Scuola Normale Superiore, between March 30th and April st However, to find a general explanation and ter learned outcome. Conversely, ambiguous cues 1 2015, and benefited from the invaluable support and advice of Prof. Pier Marco Bertinetto and his team, to whom our warmest thanks go. testable hypotheses we turn to learning theory. 
should facilitate an earlier learned outcome as in Arnon and Ramscar (2012), who investigated how an L1 context and, hence, noncognates ought to adult learners acquire an artificial L2, convinc- be faster in L1 but slower in L2. Pisa, April 2nd, 2015 ingly demonstrated that “the way in which learn- In summary, in the case of ambiguous cues ing is structured has a considerable impact on what highlighting is in essence a blocking effect: Vito Pirrelli gets learned” (p. 302). In general, knowledge firstly learned relationships will be favored. This Claudia Marzi acquisition is codetermined by discrepancies be- outcome is fully consistent with the account by Marcello Ferro tween expectations based on our previous experi- Arnon and Ramscar (2012). In the case of unam-

184 1 Psycholinguistic illusions in and on morphology 2 Experiment fect Modeling (LMM), in the R software envi- ronment for statistical computing (R Core Team, Wolfgang U. Dressler Late bilinguals of German (N = 69) – students of 2014), with the lme4 and the lmerTest pack- University of Vienna German with Croatian as their L1, participated in [email protected] ages (Bates et al., 2014; Kuznetsova et al., 2014). a visual lexical decision experiment. There were The refitted model (after removing residual values two forms of the experiment (in Croatian and in greater than 2.5 of absolute standardized units), re- Fruitful interdisciplinary contact between specialists in theoretical morphology and in various German), and students were randomly assigned to branches of psycholinguistics (my examples will come from acquisition, processing, aphasia) vealed a significant effects of the control predic- one version. The entire experiment (materials and is hampered by reciprocal illusions, some of them rarely criticised explicitly. Often ecological tors, in the expected direction: facilitation from validity is dubious. instruction) was in one language and presentation order of a presentation (β = .044; SE = .007; − β sequence was randomized for each participant. t = 6.42; P r(> t ) < .0001), and inhibition In preparation for their study, Radanovic´ et − | | from the word length (β = .211; SEβ = .023; al. (2014) also conducted a normative survey with t = 9.33; P r(> t ) < .0001). Also, there was | | 1000 Serbian – English translation equivalents a significant effect of the lexicallity of the previ- The bridge of iconicity: from a world of experience to experience ranging from pairs consisting of completely ous word, where stimuli preceded by a word were of language different words (e.g., pricaˇ – story) to the identical recognized faster than those preceded by a pseu- cognates (e.g., drama – drama). 
They then selected 400 noun pairs covering a wider range of ortho-phonological similarity between L1/L2 words, using both subjective similarity ratings as well as Levenshtein distance. In the present study we made use of 344 of the previously rated word pairs, and constructed the same number of pseudowords. All of the selected 344 pairs fitted nicely for the present purposes of studying Croatian – German cognates, consistently ranging from perfect cognates to orthographically different words. We reused the same noun pairs to allow for strict comparisons of the experimental data.

2.1 Results

We calculated the normalized Levenshtein distance measure for the pairs of nouns used in the two forms of the present experiment. Similarly to the study of Radanović et al., the distribution of the Levenshtein distance measure was strictly bimodal, and, as before, the modes matched the cognate vs. noncognate distinction. That allowed us to further use a dummy-coded variable cognate (TRUE/FALSE), same as in the original study (Radanović et al., 2014).

Furthermore, we transformed the measures to ensure a better approximation to a Gaussian distribution. Word frequencies and word length were log-transformed, while an inverse transformation was applied to response latencies, following Baayen and Milin (2010).

As a last step, we excluded a small number of the extreme outliers (0.07%) from further analysis based on the visual inspection of the reaction time distribution.

The data were analyzed with Linear Mixed Ef-

doword (β = −.077; SEβ = .005; t = −14.36; Pr(>|t|) < .0001).

Most interestingly, the model revealed a significant three-way interaction between word frequency, language and cognate status (β = .053; SEβ = .012; t = 4.44; Pr(>|t|) < .0001). The observed interaction is an almost exact replication of the three-way interaction reported by Radanović et al. (2014): cognates are processed faster than noncognates in German (L2), but slower than noncognates in Croatian (L1), and the size of the effect is attenuated for high frequency words. This pattern of results is depicted in Figure 1.

With regard to the random-effects structure, by-participant and by-item adjustments to the intercept significantly contributed to the model's goodness-of-fit. Word frequency and trial order needed additional by-participant adjustments for the slopes. Word length required similar by-participant adjustments for the slope, and also revealed a significant correlation between adjustments for the intercept and the slope (r = −.72), indicating that slower and more careful participants were slowed less as item length increased.

3 Discussion

Radanović, Feldman, and Milin (2014) suggested that cognate facilitation in L2 and inhibition in L1 might be specific to the particular pairing of first and second language and/or to the level of proficiency in the L2. Results of the present study show that the particular L1/L2 combination is not critical in the sense that the same pattern generalized

Gabriella Vigliocco
University College London
[email protected]

Arbitrariness between linguistic form and meaning is taken as foundational in language studies and the question of how linguistic form links to meaning is central to language development, processing and evolution. But languages also display iconicity in addition to arbitrariness. This is especially evident in sign languages. Thus, what if the study of language started from signed rather than spoken languages? In the talk I will explore this question.

Needles in a haystack and how to find them. Can neuroscientists, psychologists and computational linguist help us (to build a tool) to overcome the “tip of the tongue” problem?

Michael Zock
LIF-CNRS, University of Marseille
[email protected]

Whenever we speak, read or write we always use words, the exchange money of concepts they are standing for. No doubt, words ARE important. Yet having stored “words” does not guarantee that we can access them under all circumstances. Some forms may refuse to come to our mind when we need them most, the moment of speaking or writing. This is when we tend to reach for a dictionary, hoping to find the token we are looking for.

The problem is that most dictionaries, be they in paper or electronic form, are not well suited to support the language producer. Hence the questions, why is this so and what does it take to enhance existing resources? Can we draw on what is known about the human brain or its externalized form (texts)? Put differently, what kind of help can we expect by looking at the work done by neuroscientists, psycholinguists or computational linguists? These are some of the questions I will briefly touch upon, ending with a concrete proposal (roadmap) outlining the major steps to be performed in order to enhance an existing electronic resource.

Content and organization of knowledge and its use in language comprehension

Marta Kutas
University of California, San Diego
[email protected]

Significant work takes place at the language-memory interface that supports word and sentence processing. Both the content and the functional organization of our world knowledge impact language comprehension in real time. Each cerebral hemisphere is involved, albeit in different ways. The nature of knowledge organization (associative, categorical, events, perceptuo-motor) and their use in predictive and/or integrative language processing have been revealed via investigations employing event-related brain potentials (ERPs). I will review some of our electrophysiological work supporting the idea that language processing is immediate and incremental, contextual, sometimes predictive, multi-modal, and bi-hemispheric.

Processing of cognates in Croatian as L1 and German as L2

Maja Anđel
University of Zagreb
[email protected]

Jelena Radanović
Laboratory for Experimental Psychology, University of Novi Sad
[email protected]

Laurie Beth Feldman
SUNY, The University at Albany
Haskins Laboratories
[email protected]

Petar Milin
University of Novi Sad
Eberhard Karls University Tübingen
[email protected]

1 Introduction

Cognates are defined as words similar in form and meaning across two languages. Similarity in form may range from full orthographic overlap, as in English film – German Film, to partial overlap, as in English chapel – German Kapelle. Some pairs of cognate words developed historically from a common ancestor word, whereas others emerge when languages come into contact and loan each other words. Language users are typically unaware of such diachronic pressures. When acquiring a second language (L2) they can only perceive shared elements between L1 and L2.

Cognates help explain the nature of lexical processing and the manner in which elements from the two languages interact. Different measures have been used to explore cognate processing and representation, including ERP (Midgley et al., 2011; Peeters et al., 2013; Strijkers et al., 2009), latencies in single-word (Dijkstra et al., 2010; Lemhöfer and Dijkstra, 2004) and primed lexical decision (De Groot and Nas, 1991), eye-movements (Mulder et al., 2011; Rosselli et al., 2012), and scores on standardized tests (Kelley and Kohnert, 2012; Pérez et al., 2010). Taken together, empirical findings support the claim that cognates are processed differently from noncognate words. Despite the fact that the aforementioned experimental measures and techniques diverge, the conclusion is similar both in language production and in language comprehension (Dijkstra et al., 2010, for an overview). Nevertheless, results do differ with respect to a range of details, including the direction as well as the magnitude of the cognate effect. Specifically, most studies find facilitation in the processing of cognates in L2 (Dijkstra et al., 1999; Lemhöfer and Dijkstra, 2004; Van Hell and De Groot, 2008), but results are less clear when it comes to the effect of cognates in L1. For example, Van Hell and Dijkstra (2002) and Duyck (2005) reported cognate facilitation in the dominant language, while Kroll et al. (2002) reported small cognate inhibition in an L1 naming task, and Caramazza and Brones (1979) failed to find such an effect at all.

In the present study we sought to examine the influence of cognates on lexical processing in a visual lexical decision task, using L1/L2 language pairs that belong to different subgroups of Indo-European languages: Slavic L1 and Germanic L2. The aim was to carefully replicate recent findings from a study by Radanović, Feldman, and Milin (2014). Crucially, their study showed quite a complex pattern of effects that included a three-way interaction of language (Serbian L1 vs. English L2) by cognate status (cognate vs. noncognate) by word frequency (as a numerical predictor – covariate). Cognates were processed faster than noncognates in L2, but, surprisingly, significantly slower than noncognates in L1. Furthermore, the size of the effect was greater when word frequency was low.

Because this pattern of effects differs from what is typically reported in the literature, we designed a replication of the Radanović et al. study and followed their method and design, this time using another contrasting pair of languages: Croatian (L1) and German (L2).
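The graded notion of form similarity described in the introduction is quantified in Section 2.1 with a normalized Levenshtein distance. The sketch below only illustrates that idea on the two English – German examples given above. The source does not say how the distance was normalized or whether case was folded, so dividing by the length of the longer word and lowercasing both words are assumptions, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length (assumed normalization)."""
    a, b = a.lower(), b.lower()  # assumption: case differences are ignored
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


for english, german in [("film", "Film"), ("chapel", "Kapelle")]:
    print(english, german, round(normalized_levenshtein(english, german), 3))
```

Under these assumptions a full orthographic overlap such as film – Film scores 0, while a partial overlap such as chapel – Kapelle falls strictly between 0 and 1. A score of this kind is what allows a bimodal distribution of distances to be read as a cognate vs. noncognate split.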

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

References

Judith Aissen 2003. Differential Object Marking: Iconicity vs. economy. Natural Language and Linguistic Theory, 21: 435 – 483.
Bernd Bohnet 2010. Very High Accuracy and Fast Dependency Parsing is not a Contradiction. The 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China.
William Croft 2003. Typology and Universals. Cambridge University Press (2nd edition).
George W. Furnas, Thomas K. Landauer, Louis M. Gomez, and Susan T. Dumais 1983. Statistical semantics: Analysis of the potential performance of keyword information systems. Bell System Technical Journal, 62 (6): 1753 – 1806.
Wolfgang Klein 2009. How time is encoded. In W. Klein and P. Li (eds.), The expression of time, 39 – 82. Berlin: Mouton de Gruyter.
Thomas K. Landauer and Susan T. Dumais 1997. A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104: 211 – 240.
Patrick Pantel 2005. Inducing ontological co-occurrence vectors. In Proceedings of Association for Computational Linguistics (ACL-05): 125 – 132.
Michael Richter and Roeland van Hout 2015. A classification of German verbs using empirical language data and concepts of Vendler and Dowty. To appear 2015 in Sprache und Datenverarbeitung – International Journal for Language Data Processing.
Herbert Rubenstein and John B. Goodenough 1965. Contextual correlates of synonymy. In Communications of the ACM, Vol 8 (10): 627 – 633.
Helmut Schumacher 1986. Verben in Feldern. Valenzwörterbuch zur Syntax und Semantik deutscher Verben. Berlin & New York: De Gruyter.
Hinrich Schütze and Jan Pedersen 1993. A vector model for syntagmatic and paradigmatic relatedness. In Making Sense of Words: 104 – 113. Ninth Annual Conference of the UW Centre for the New OED and Text Research, Oxford.
Peter D. Turney and Patrick Pantel 2010. From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research, 37: 141 – 188.
Zeno Vendler 1967. Linguistics in Philosophy. Ithaca / New York: Cornell University Press.
Warren Weaver 1955. Translation. In W. N. Locke, D.A. Booth, Machine Translation of Languages: 15 – 23. Cambridge Mass.: MIT Press.

Implicative structure and joint predictiveness

Olivier Bonami
Université Paris-Sorbonne
Laboratoire de linguistique formelle (U. Paris Diderot & CNRS)

Sacha Beniamine
Université Paris Diderot
Laboratoire de linguistique formelle & Alpage (Inria & U. Paris Diderot)

[email protected]

1 Introduction

(Ackerman et al., 2009) define the PARADIGM CELL FILLING PROBLEM (PCFP), which we paraphrase in (1), as the cornerstone of the study of inflectional paradigms.

(1) How do speakers know how to inflect the full paradigm of a lexeme on the basis of exposure to only some of its forms?

(Ackerman et al., 2009) go on to argue that speakers rely on knowledge of the IMPLICATIVE STRUCTURE of paradigms (Wurzel, 1984): paradigms are structured in such a way that there are reliable correlations between the form filling one paradigm cell A and the form filling another cell B. The reliability of these correlations depends on the particular pair of cells A and B under scrutiny; it can be assessed quantitatively by examining the statistical distribution of operations required to go from A to B in the lexicon.

This presentation focuses on one particular aspect of implicative structure, which we call JOINT PREDICTIVENESS. In some situations, joint knowledge of two paradigm cells A and B provides more information on cell C than could be inferred from knowledge of either A or B. Table 1 below provides a simple example from French, using lexemes illustrating 7 patterns corresponding to 95% of the verbs documented in the Flexique phoneticized lexicon (Bonami et al., 2014).

In French conjugation, predicting the past participle from the infinitive is hard, because of the opacity between second conjugation infinitives, such as BÂTIR, and some third conjugation infinitives, such as TENIR, OUVRIR, MOURIR. Predicting the past participle from present SG forms is also hard, this time because some first conjugation verbs with a stem in -i (e.g. RELIER) are not distinguished from second conjugation verbs. A different subset of first conjugation verbs (e.g. RATISSER) raises similar problems for PL forms. Overall, no other cell in the paradigm is a very good predictor of the past participle. However, joint knowledge of some pairs of paradigm cells radically improves the quality of prediction. For instance, joint knowledge of the infinitive and some present plural form removes all uncertainty in the sample in Table 1: knowledge of the infinitive form partitions the set of lexemes in two classes within which the PRS.3PL is fully predictive of the past participle.

Although the existence of joint predictiveness is acknowledged in the literature (Matthews, 1972; Thymé et al., 1994; Ackerman et al., 2009; Stump and Finkel, 2013; Blevins, forthcoming; Sims, forthcoming), little attention has been given to quantifying its importance. In this paper we first give further arguments that joint predictiveness is a crucial aspect of implicative structure, and that a careful empirical examination of joint predictiveness is essential to both linguistic and psycholinguistic assessment of the PCFP and related issues. We then propose and illustrate a method for the quantitative evaluation of joint predictiveness. We end with a discussion of principal part systems.

2 The relevance of joint predictiveness

We start by establishing that speakers do have the opportunity to use joint predictiveness. Figure 1 plots how the number of forms per lemma evolves when walking through the 1.6 billion words of the FrWaC web corpus (Baroni et al., 2009), restricting attention to the 6847 verbs documented in the Lefff lexicon (Sagot, 2010) to compensate for tagging errors.1 Note that 1.6 billion words is

1 Note that this restriction leads to overestimating the average number of forms per lemma, as neologisms, very rare words and hapaxes not present in the lexical resource are not included. We are counting distinct forms rather than distinct paradigm cells, as there is currently no tagger for French that reliably disambiguates homographic forms of the same lexeme. French verbs have 51 paradigm cells, and the average number of distinct forms per verb in the Lefff lexicon is 35.8.


4 181 Lexeme INF PRS.3SGPRS.3PLPST.PTCP # In contrast the remaining features, including the Figure 2. Accuracy of the argument and aspectual aspectual feature which yields .514 % accuracy, features using five aspectual verb classes vs. ten LIVRER ‘deliver’ livKe livK livK livKe 4108 classes with concrete lexical properties as gold RELIER ‘link’ K@lje K@li K@li K@lje 210 with k = .317 (fair agreement), perform poorly. RATISSER ‘rake’ Katise Katis Katis Katise 22 Taking the classification according to concrete standard. BATIRˆ ‘build’ 327 semantic properties into ten classes as the gold batiK bati batis bati Note: s: subject, d: direct object, i: indirect object, p: TENIR ‘hold’ t@niK tj˜E tjEn t@ny 37 standard we observed that the hierarchy remains prepositional object, as: aspectual features, and OUVRIR ‘open’ uvKiK uvK uvK uvEK 8 MOURIR ‘die’ 1 almost the same, the subject feature outperforms combinations of predictors, for instance, das: direct muKiK mœK mœK mOK the remaining features. However, the accuracy is object and aspect, sp: subject and prepositional object. Table 1: Exemplary paradigms for inflection patterns for 4-cell subparadigms of French verbs (data from considerably lower compared to the Flexique — 5% of the lemmas illustrating minor patterns have been excluded) classification with 5 aspectual verb classes. The 3 Conclusion subject achieves .657 accuracy, k = .573. The The study provides evidence for the hypothesis combinations subject-direct object-aspectual that aspectual verb classes can be induced from in the order of magnitude of the overall linguis- ness from two cells to infer the likely form of the features and subject-direct object-prepositional classified nominal fillers in argument positions. tic exposure of an adult speaker. The distribution participle. 
object yield .628 accuracy with k = .458, For the five aspectual verb classes used as the followed by the combinations subject-direct strongly suggests that, as speakers get exposed to The final observation is that there are important gold standard (Richter and van Hout, 2014) it more words, paradigms fill slowly on average, so linguistic generalizations that can only be obtained object and subject-aspectual features with .6 turned out that noun classes in subject positions that predicting unknown forms stays relevant; at by looking at joint predictiveness. To supplement accuracy each and k = .495. These combinations have the highest predictive power compared to the same time, speakers are massively exposed to the French data presented in the introduction, let exhibit a moderate agreement. Again, the the nouns in the remaining argument positions multiple forms of the same lexemes, which makes aspectual feature performs poorly with .428 and the aspectual features derived from Vendler us consider a spectacular example from European accuracy, k = .266 which is a fair agreement. In (1967). This result is surprising since the knowledge of joint predictiveness relevant to ad- Portuguese, concerning the prediction of the form figure 2 the accuracy of the argument and Vendlerian aspectual categories were formulated dressing the PCFP. of the infinitive from those of the present singu- aspectual features for the comparisons against in order to distinguish aspectual classes. Future A second relevant observation is that speakers lar. Table 2 presents relevant data. Because it does both gold standard classifications are given. research should explore a comparison of the do manifest knowledge of joint predictiveness. Al- not contain a theme vowel, the present 1SG is a predictive power of nominal and aspectual though this topic deserves dedicated experimen- bad predictor of the infinitive: a priori, any present features. 
tal studies that are beyond the scope of this pa- 1SG could correspond to a first, second or third Using a classification into concrete lexical fields per, circumstantial evidence from speech errors conjugation verb. 2SG and 3SG forms are slightly as the gold standard of the predictive values we is easy to find. One common conjugation error better predictors, as they distinguish first conju- observed a considerable decrease in the in French (Kilani-Schoch and Dressler, 2005) is gation endings (-5S,-5) from second/third conjuga- predictive values indicated by the lower kappa to use mouru as the past participle of MOURIR, tion endings (-@S,-@); the distinction between the values. We explain this result by the difference whereas mouri is almost never used (140 rele- two last conjugations is still neutralized. How- in information provided by the argument vant occurrences of mouru in the full FrWaC cor- ever, if a verb has a mid prethematic vowel in the structures of the verbs in the 5 class-gold pus, 0 or mouri). This would be surprising if 2SG and 3SG, the shape of that vowel is raised standard classification in contrast to the information provided by co-occurrences that is, speakers were analogizing from a single paradigm to high-mid in the 1SG in the second conjugation lexical information of any type in the context of cell: given knowledge of the sole infinitive, mouri (witness RECEBER, RECORRER), and to high in verb in the 10-class gold standard classification. would be the most likely regularization; given the third conjugation (witness SEGUIR, SUBIR). The results of this study show that: 1. Aspectual knowledge of some present form, mour or meur Whether one sees this phenomenon as the result verb classes can be empirically validated, 2. 
would be expected.2 Thus the property speakers of a synchronic vowel harmony in the 1SG oper- Classified nouns in subject argument positions seem to be sensitive to is the existence of an al- ating prior to theme vowel deletion (Mateus and are reliable predictors of aspectual verb classes, lomorphic relation between the infinitive and the d’Andrade, 2000) or as a historical remnant with i.e. the meaning of nouns in combination with present stem—hence, employing joint predictive- no synchronic motivation, it remains that on the their noun classes correlates with aspectual parts surface, for verbs with a mid prethematic vowel of the verbal meaning. In order to confirm these 2A reviewer points out that if speech errors are due to analogy to the nearest (frequent) neighbor, mouru is unsur- in the 2SG and 3SG, knowledge of the 1SG disam- results further research with an extended test set prising, as courir (past participle couru) is the most frequent biguates whether the verb belongs to the second or of verbs is needed. of the verbs whose infinitive is at a minimal edit distance from third conjugation and thus helps predict the infini- mourir. This assumption however is not plausible. Witness tive. Acknowledgments the case of the verb dire, whose present 2PL dites is very com- monly overregularized to disez. The most frequent phonolog- ical neighbor of dire is lire; however, according to the lexique 3 Quantifying joint predictiveness Roeland van Hout suggested to evaluate the database (New et al., 2007), dire is 8 times more frequent classifications with Cohen‘s Kappa and was very than lire in written French, and 17 times in spoken French. To assess the importance of joint predictiveness, helpful in the calculations of the k-values. It is thus not plausible that analogical regularization is driven by the closest neighbor; rather, it is driven by general pat- we build on previous proposals by (Bonami and

terns applying across lexemes—for instance, dire is one of a Boye,´ 2014) and (Bonami and Lu´ıs, 2014) on handful of exceptions to the regular Xons Xez alternation ∼ the evaluation of predictiveness from a single between 1PL and 2PL, that is overwhelmingly prevalent both in type and token frequency. paradigm cell, themselves improving on (Acker-

180 5 20 100 Mean forms per lemma % of lemmas with moreMean than forms 1 form per lemma (right) SVM classifier with a non-linear kernel which müssen* ‘to must’, einschlafen* ‘to fall asleep’ , achieved the best results. vergehen ‘to go (by)/ to pass/to disappear’, 80 15 We first trained the SVM using the classification übersehen ‘to overlook’, fehlen ‘to lack’, of Richter and van Hout (2014) as a gold verlieren ‘to loose’, verhindern ‘to prevent’, 60 standard and tested it with a 10-fold cross- abgrenzen ‘to mark off/ to define’, abweichen

10 validation. The gold standard classification in ‘to deviate

lemmas of % detail: 4. verbs of transfer (of information):

Number of forms of Number 40 mitteilen ‘to inform’, übermitteln ‘to

5 1. accomplishments: communicate/to forward’ 20 aufbauen auf ‘to build on/to be based on’, 5. verbs of examination (by mental activity): herstellen ‘to produce’, schneiden ‘to cut‘, nachprüfen ‘to ascertain/to check’, erörtern ‘to zersägen ’to saw into pieces‘, verlängern ‘to debate’, untersuchen ‘to examine’ 0 0 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 size of the corpus 1e9 extend’, mitteilen ‘to tell/to inform’, übermitteln 6. verbs of production: ‘to communicate/to forward’, verhindern ‘to aufbauen auf ’to build on‘/acc to be based on’, Figure 1: Mean number of forms per lemma and proportion of lemmas with multiple forms as a function prevent’, abgrenzen ‘mark off/to define’ herstellen ‘to produce’ of vocabulary size (FrWaC corpus) 2. accomplishments with affected subject: 7. verbs of beginning and rising processes: untersuchen ‘to examine‘, bedenken ’to anfangen ‘to begin’, ansteigen ‘to rise/ to INF 1SG 2SG 3SG 1PL 2PL 3PL consider‘, erörtern ‘to debate’, nachprüfen ‘to increase’ ascertain/to check’, aufessen ‘to eat up’, essen 8. verbs of discussion and consideration: LEVAR l@"vaR "lEvu "lEv5S "lEv5 l@"v5muS l@"vaiS "lEv˜5u NOTAR nu"taR "nOtu "nOt5S "nOt5 nu"t5muS nu"taiS "nOt˜5u ‘to eat’ betreffen ‘to concern’, bedenken ‘to consider’, 3. activities: eingehen auf ‘to respond to so./sth.’, halten für RECEBER r@s@"beR r@"sebu r@"sEb@S r@"sEb@ r@s@"bemuS r@s@"b5iS r@"sEb˜5˜ı RECORRER r@ku"reR r@"koru r@"kOr@S r@"kOr@ r@ku"remuS r@ku"r5iS r@"kOr˜5˜ı laufen ‘to walk/to run‘, eingehen auf ‘to respond ‘to take, richten auf ‘to direct towards‘, to so./sth.’, hämmern ‘to hammer’, ansteigen ‘to orientieren an ‘to be geared to’ SEGUIR s@"giR "sigu "sEg@S "sEg@ s@"gimuS s@"giS "sEg˜5˜ı SUBIR su"biR "subu "sOb@S "sOb@ su"bimuS su"biS "sOb˜5˜ı increase‘ 9. verbs of membership and agreement: 4. 
achievements: angehören ‘to belong to’, übereinstimmen mit ’to Table 2: Selected European Portuguese verbs in the infinitive and present indicative einschlafen ‘to fall asleep‘, vergehen ‘to go agree with’ (by)/to pass/to diasappear‘, übersehen ‘to 10. folgen aus ‘to follow from’, laufen ‘to overlook’, verlieren’to loose’, anfangen ‘to walk/to run’, existieren ‘to exist’, man et al., 2009) and (Ackerman and Malouf, classification from raw data is an open research begin‘, abweichen ‘to deviate‘, sich orientieren verlängern ‘to extend’ 3 2013). Specifically, for every pair of paradigm question, we opportunistically use the algorithm an ‘to be geared to‘, richten auf ‘to direct cells A and B, we infer a classification of pat- sketched in (2) that we know to give satisfactory towards/to focus’ 2.1 Results terns of alternation relating these two cells. These results for the languages at hand. 5. states: patterns are then used to define a random vari- existieren ‘to exist‘, fehlen ‘to lack‘, müssen ‘to In order to evaluate the consistency of the (2) a. For any pair of strings ϕ , ϕ , find able A B over pairs of forms corresponding to ⟨ 1 2⟩ must‘, halten für ‘to take so./sth.for so./sth.‘, comparisons of the classifications against the ∼ strings α, γ, β , β , δ and δ such that the distribution of patterns, and a random vari- 1 2 1 2 folgen aus ‘to follow from‘, angehören ‘to gold standards we calculated both accuracy and ϕ1 = αβ1γδ1 and ϕ2 = αβ2γδ2, able AA B classifying possible form for A on belong to‘, übereinstimmen ‘to agree‘, betreffen Cohen’s kappa. The latter measure considers the ∼ where β and β have the same length; the basis of the patterns they could possibly in- 1 2 ‘to concern’, abweichen ‘to deviate’ , verhindern number of classes which differ in the two gold segments in β and β (resp. δ and stantiate. For instance, going back to the data 1 2 1 ‘to prevent’ standards and, in addition, gives the significance δ2) match in category (vowel vs. 
con- in Table 1, INF PST.PTCP partitions the set of levels. ∼ sonant), starting from the left; and The classification into classes of concrete lexical pairs in 5 subsets corresponding to the patterns Taking the classification with five aspectual the length of α is maximal. Clas- properties which we induced from the co- verbs classes as gold standard the subject feature Xe Xe, XiK Xi, XiK Xy, XKiK XEK and ∼ ∼ ∼ ∼ sify the pair as instantiating pattern occurrence data bank (see above) is given below clearly outperforms the remaining features with XuKiK XOK, while INFINF PST.PTCP partitions the ∼ ∼ [Xβ1Y δ1 Xβ2Y δ2 / α γ ]. (the class labels are compatible with .857 accuracy (which means that 30 of 35 verbs set of infinitive forms in 4 sets, depending on ∼ Schumacher’s labels and are assigned using our b. For all patterns instantiat- were classified correctly) and k = .812. Kappa whether they end in -e, -uKiK, -V KiK with V = u, linguistic intuitions; class 10 is incoherent and values above .61 are characterized as substantial, ̸ ing the same alternation or -XiK with X = K. could not be labelled): above .81 as almost perfect agreement and ̸ [x y / α1 γ1 ],..., [x y / αn γn ], H(A B AA B), the conditional entropy of ∼ ∼ therefore highly significant. The combinations ∼ | ∼ determine maximally specific feature de- the pattern relating A and B given relevant fea- 1. verbs of activities manipulating a substance subject-direct object-prepositional object- scriptions of sets of strings α1, . . . , αn tures of the form filling A, evaluates how well A { } (normally with a tool): aspectual features and subject-direct object- predicts B. 3The problem can be presented as that of finding, for any hämmern ‘to hammer’, schneiden ‘to cut’, aspectual features yield .828 accuracy, k = .775 set of pairs of forms, a minimal set of subsequential finite- zersägen ‘to saw into pieces’ and k = .773, respectively. 
The combinations Crucial to this computation is the choice of state transducers such that one of the transducers maps each 2. verbs of consumption: subject-prepositional object-aspectual features, a strategy of exhaustive classification of patterns input form to the correct output. Even if that problem were aufessen ‘to eat up’, essen ‘to eat‘ solved, it is entirely possible for there to be more than one subject-direct object-prepositional object and of alternation between pairs of forms. Since the such minimal set, leading to competing classifications of the 3. verbs of difference, ‘negative’ processes, non- subject- aspectual features yield .8 accuracy each design of an algorithm finding an optimal such pairs and thus to different assessments of predictiveness. existence: with k = .741, k = .739 and k = .71, respectively.

6 179 and γ , . . . , γ , using (Albright, a form with a 3-way contrast of theme vowels, from the co-occurrence data bank (CCDB) of the by the TF-IDF measure and classified by cluster { 1 n} 1 2002)’s Minimal Generalization strat- such as the infinitive, and a form with stress on Institut für Deutsche Sprache (IDS). analyses carried out on a matrix with similarity values taken from the co-occurrence data bank egy. the prethematic vowel, such as the present 3SG. 2 Method This corresponds to the observation in (Bonami (CCDB) of the Institut für Deutsche Sprache Joint predictiveness can then be assessed look- (IDS).5 On the matrix of the similarity values, a and Lu´ıs, 2014) that such pairs of cells have com- We classified 35 common German verbs used by ing at joint random variables: predicting C from plementary predictive power. The sheer number Schumacher (1986), who defines seven lexical cluster analysis with Ward’s method and A and B is evaluated by (3): we assess the uncer- of alternative principal part systems highlights the semantic macrofields and 30 subfields. We chose Euclidean distance was carried out. According to the Bayesian Information Criterion there are two tainty associated with predicting both the pattern arbitrariness of the choice of a particular set of the verbs from all subfields, the only criterion relating A to C and the pattern relating B to C, being the representation of every subfield in optimal noun classes for all arguments. We principal parts (Matthews, 1972; Ackerman et al., interpreted the resulting noun classes using our given knowledge of relevant properties of A, rel- 2009; Blevins, forthcoming). order to cover the total semantic range of evant properties of B, and the pattern relating A Schumacher’s typology (1986). We checked the intuition thereby applying the criterion of Turning to French, we found no set of prin- and B. 
[…] frequency of the verbs in the first one million sentences containing at least one of our selected verbs of the web based 880 million word SDEWAC corpus.2 The verbs of our test set occurred in more than one million sentences with a mean frequency of approximately 30,000 occurrences per verb. 66 percent of the verbs were in the interval between 5,000 and 40,000 occurrences, the more frequent outliers being müssen ‘must’ with 500,965 and halten für ‘to take so./sth. for so./sth.’ with 123,595 occurrences.

We added five verbs: hämmern ‘to hammer’, schneiden ‘to cut’, aufessen ‘to eat up’, laufen ‘to walk/to run’, and zersägen ‘to saw into pieces’, since a previous study (Richter and van Hout, 2015) showed (i) that laufen ‘to walk/to run’ and zersägen ‘to saw into pieces’ are typical activity and accomplishment verbs, respectively, and (ii) that aufessen ‘to eat up’ is a typical accomplishment verb with an affected subject. Schneiden ‘to cut’ and hämmern ‘to hammer’ were ambiguous (Richter and van Hout, 2015), but in this study we decided to classify the former as an accomplishment verb and the latter as a process verb.

In order to determine the verbs’ arguments we parsed at most 30,000 sentences per verb using the Mate-Tools dependency parser (Bohnet, 2010).3 The whole code we used for filtering and parsing the sentences, and for aggregating the actants and aspectual features (see below), is available on GitHub.4 The 35 verbs of our test set (Richter and van Hout, 2015) are represented as 139-dimensional vectors containing the 30 most frequent nouns in the verbs’ argument positions: subjects, direct objects, indirect objects and prepositional objects. The nouns were weighted […]

[…] animacy (Croft, 2003; Aissen, 2003): The resulting two noun classes can be interpreted as denoting predominantly animate and inanimate things, respectively. Class 1[+animate], for instance, contains nouns such as Arzt ‘doctor’ and Lehrkraft ‘teacher’, and class 2[-animate] contains nouns such as Entwicklung ‘development’, Organisation ‘organization’ and Wahrnehmung ‘perception’.

The verbs’ vectors consist of areas for each argument type. There are four areas in total and each area is split into areas for each noun class, as depicted in (1):

Figure 1. Dimensions of verb vectors: Weighted verbs in noun class areas.

In addition, the vectors were completed by aspectual features that Vendler (1967) suggested in order to distinguish aspectual verb classes. The aspectual features indicate, for instance, whether the verbs occur in sentences with temporal specifications of duration or of a limited time span, with the prepositions for and in, respectively, as in he wrote the letter for an hour versus he wrote the letter in an hour, whether the verbs can be embedded by matrix verbs such as persuade, or whether they occur in imperative forms. In order to classify the 35 verbs we used a

1 http://corpora.ids-mannheim.de/ccdb/. The similarity values were provided by Cyril Belica.
2 The SdeWaC Corpus is available at the WaCky Corpora download page at http://wacky.sslmit.unibo.it/doku.php?id=corpora
3 See https://code.google.com/p/mate-tools/
4 https://github.com/spinfo/verbclass
5 The similarity values were provided by Cyril Belica.

Notice that this easily generalizes to prediction given joint knowledge of different cells.

(3) $H(A \sim C, B \sim C \mid A_{A \sim C}, B_{B \sim C}, A \sim B)$

Table 3 shows the average entropy from 1 or 2 cells for 5000 French verbs and 2000 European Portuguese verbs respectively.4 In both languages, knowing a second cell significantly reduces uncertainty on average.

# of predictor cells    French    Portuguese
1                       0.1670    0.1649
2                       0.0540    0.0818

Table 3: Average conditional entropy when predicting from 1 or 2 cells

4 Principal part systems

A system of principal parts is a set of paradigm cells such that knowledge of the forms filling these cells is sufficient to derive the rest of the paradigm (Hockett, 1967; Matthews, 1972; Finkel and Stump, 2007; Stump and Finkel, 2013).5 The validity of a principal part system thus rests on the existence of systematic categorical joint predictiveness; and the evaluation method outlined in the preceding section may be used to infer sets of principal parts.

Exploring this issue on the European Portuguese dataset, we find that there are 177 such systems for Portuguese. All these systems include […]

[…] principal parts of cardinality 2, as already observed by Stump and Finkel (2013). This is testament to the prevalence of erratic stem allomorphy in French conjugation, leading to numerous situations of unpredictability local to a small subpart of the paradigm (Bonami and Boyé, 2002). However, this observation should be qualified in two ways.

First, our method yields 396 sets of principal parts of cardinality 3, whereas Stump and Finkel (2013) found no set of cardinality smaller than 5. This difference seems to be due to the fact that, under the methodology used here, the applicability of a pattern of alternation is sensitive to phonotactic properties of the stem (thanks to the use of the Minimal Generalization strategy in (2b)), whereas Stump and Finkel (2013) only look at exponence. Arguably then, the present method provides a superior evaluation of the diagnostic value of paradigm cells.

Second, although there is no pair of cells with categorical diagnostic value, some come very close. There are 25 pairs of cells (among which pairs of very frequent cells such as the present 3PL and the infinitive) such that predicting any other cell from this pair yields an entropy below 0.005. This means that given knowledge of these two cells, trying to guess any other cell will be about as hard as predicting an event with a 99.95% probability of occurrence.6 This casts doubt both on the pedagogical value of categorical principal part systems and on the usefulness of principal part systems, as opposed to graded evaluations of joint predictiveness, for the study of morphological competence.

Acknowledgments

This work was partially supported by a public grant overseen by the French National Research

4 The French dataset was extracted from Flexique (Bonami et al., 2014). The Portuguese dataset was derived from the University of Coimbra pronunciation dictionary (Veiga et al., 2012) for the purpose of (Bonami and Luís, 2013).
5 We focus here on traditional static principal part systems. See (Bonami and Boyé, 2007; Finkel and Stump, 2007; Stump and Finkel, 2013) for alternative formulations of the notion of principal part where different sets of paradigm cells serve as predictor depending on the lexeme.
6 If X is a binary random variable one of whose values has a probability of 0.9995, H(X) > 0.0062.
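The entropy figure quoted in the footnote on a binary variable with probability 0.9995 can be checked directly. The following is a minimal sketch, assuming entropy is measured in bits (base-2 logarithm); the function name `binary_entropy` is ours and does not come from the paper.

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a binary variable with outcome probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Entropy of an event that occurs with probability 0.9995.
h = binary_entropy(0.9995)
print(f"H(X) = {h:.4f} bits")
```

Under this assumption the value lies just above 0.0062 bits, so a conditional entropy below 0.005 corresponds to a prediction that is slightly easier still.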

Agency (ANR) as part of the “Investissements d’Avenir” program (reference: ANR-10-LABX-0083).

References

[Ackerman and Malouf2013] Farrell Ackerman and Robert Malouf. 2013. Morphological organization: the low conditional entropy conjecture. Language, 89:429–464.

[Ackerman et al.2009] Farrell Ackerman, James P. Blevins, and Robert Malouf. 2009. Parts and wholes: implicative patterns in inflectional paradigms. In James P. Blevins and Juliette Blevins, editors, Analogy in Grammar, pages 54–82. Oxford University Press, Oxford.

[Albright2002] Adam C. Albright. 2002. The Identification of Bases in Morphological Paradigms. Ph.D. thesis, University of California, Los Angeles.

[Baroni et al.2009] Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. 2009. The wacky wide web: A collection of very large linguistically processed web-crawled corpora. In Language Resources and Evaluation, volume 43, pages 209–226.

[Blevinsforthcoming] James P. Blevins. forthcoming. Word and Paradigm Morphology. Oxford University Press, Oxford.

[Bonami and Boyé2002] Olivier Bonami and Gilles Boyé. 2002. Suppletion and stem dependency in inflectional morphology. In Franck Van Eynde, Lars Hellan, and Dorothee Beerman, editors, The Proceedings of the HPSG ’01 Conference, pages 51–70. CSLI Publications, Stanford.

[Bonami and Boyé2007] Olivier Bonami and Gilles Boyé. 2007. Remarques sur les bases de la conjugaison. In Elisabeth Delais-Roussarie and Laurence Labrune, editors, Des sons et des sens, pages 77–90. Hermès, Paris.

[Bonami and Boyé2014] Olivier Bonami and Gilles Boyé. 2014. De formes en thèmes. In Florence Villoing, Sarah Leroy, and Sophie David, editors, Foisonnements morphologiques. Etudes en hommage à Françoise Kerleroux, pages 17–45. Presses Universitaires de Paris Ouest.

[Bonami and Luís2013] Olivier Bonami and Ana R. Luís. 2013. Causes and consequences of complexity in portuguese verbal paradigms. In 9th Mediterranean Morphology Meeting, Dubrovnik, septembre.

[Bonami and Luís2014] Olivier Bonami and Ana R. Luís. 2014. Sur la morphologie implicative dans la conjugaison du portugais : une étude quantitative. Mémoires de la Société de Linguistique de Paris, 22:111–151.

[Bonami et al.2014] Olivier Bonami, Gauthier Caron, and Clément Plancq. 2014. Construction d’un lexique flexionnel phonétisé libre du français. In Franck Neveu, Peter Blumenthal, Linda Hriba, Annette Gerstenberg, Judith Meinschaefer, and Sophie Prévost, editors, Actes du quatrième Congrès Mondial de Linguistique Française, pages 2583–2596.

[Finkel and Stump2007] Raphael Finkel and Gregory T. Stump. 2007. Principal parts and morphological typology. Morphology, 17:39–75.

[Hockett1967] Charles F. Hockett. 1967. The Yawelmani basic verb. Language, 43:208–222.

[Kilani-Schoch and Dressler2005] Marianne Kilani-Schoch and Wolfgang Dressler. 2005. Morphologie naturelle et flexion du verbe français. Gunter Narr Verlag, Tübingen.

[Mateus and d’Andrade2000] Maria Helena Mateus and Ernesto d’Andrade. 2000. The Phonology of Portuguese. Oxford University Press, Oxford.

[Matthews1972] P. H. Matthews. 1972. Inflectional Morphology. A Theoretical Study Based on Aspects of Latin Verb Conjugation. Cambridge University Press, Cambridge.

[New et al.2007] Boris New, Marc Brysbaert, Jean Veronis, and Christophe Pallier. 2007. The use of film subtitles to estimate word frequencies. Applied Psycholinguistics, 28:661–677.

[Sagot2010] Benoît Sagot. 2010. The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French. In Proceedings of LREC 2010.

[Simsforthcoming] Andrea Sims. forthcoming. Inflectional defectiveness. Cambridge University Press, Cambridge.

[Stump and Finkel2013] Gregory T. Stump and Raphael Finkel. 2013. Morphological Typology: From Word to Paradigm. Cambridge University Press, Cambridge.

[Thymé et al.1994] Ann Thymé, Farrell Ackerman, and Jeff Elman. 1994. Finnish nominal inflection: Paradigmatic patterns and token analogy. In Susan D. Lima, Roberta Corrigan, and Gregory K. Iverson, editors, The Reality of Linguistic . John Benjamins.

[Veiga et al.2012] Arlindo Oliveira da Veiga, Sara Candeias, and Fernando Perdigão. 2012. Generating a pronunciation dictionary for european portuguese using a joint-sequence model with embedded stress assignment. Journal of the Brazilian Computer Society, 88.

[Wurzel1984] Wolfgang Ulrich Wurzel. 1984. Flexionsmorphologie und Natürlichkeit. Ein Beitrag zur morphologischen Theoriebildung. Akademie-Verlag, Berlin. Translated as (Wurzel, 1989).

Classification of German verbs using nouns in argument positions and aspectual features

Michael Richter                  Jürgen Hermes
Radboud University Nijmegen      University of Cologne
[email protected]                [email protected]

Abstract

This paper provides evidence that aspectual verb classes (Vendler, 1967) can be induced from nominal fillers in argument positions and aspectual features. We classified 35 German verbs in a supervised learning procedure using a support vector machine classifier and a classification into five aspectual classes (Richter and van Hout, 2015) as gold standard, and observed excellent and substantial agreements.

1 Introduction

This study aims to empirically validate aspectual verb classes in German using large corpus data. Siegel (1997) and Siegel and McKeown (2000) induced the two aspectual classes states and events in the frame of a vector space model from corpora; however, an induction of the complete Vendlerian typology has not yet been undertaken. We hypothesize that aspectual verb classes can be automatically induced from the classified nominal fillers in the argument positions of verbs. Our hypothesis refers to the Distributional Hypothesis (Rubenstein and Goodenough, 1965; Schütze and Pedersen, 1995; Landauer and Dumais, 1997; Pantel, 2005), which says that semantically related linguistic elements appear in semantically related contexts. The present study in the framework of a vector space model is also driven by the Statistical Semantics Hypothesis (Weaver, 1955; Furnas et al., 1983; Turney and Pantel, 2010), which states that linguistic meaning can be derived from statistical linguistic patterns. In order to test our hypothesis, we took a test set of verbs from Schumacher (1986) and determined the nominal fillers and their classes in argument positions, that is, in subject, direct, indirect, and prepositional object positions, by parsing a very large German corpus. As gold standard we used the aspect-based classification of Richter and van Hout (2015) into five classes, which extends the typology of Vendler (1967), i.e. accomplishments, achievements, states and activities, by the class accomplishments with an affected subject.

This classification into five aspectual verb classes was derived by combining two user-based classifications, induced by cluster analyses from raters’ judgments and associations with stimulus verbs, and two usage-based classifications induced from corpus data (Richter and van Hout, 2015). We took this classification as gold standard as we were interested in the correlation between the semantics of the nominal fillers in argument positions of verbs and the aspectual properties of verbs, thereby following Klein (2009), who defines aspect as a grammatical category of verbs.

In the present study we represent verbs as vectors that consist of nouns in argument positions separated into areas according to their noun classes, which were induced by cluster analyses from similarity data. In addition, we added aspectual features as defined by Vendler (1967) to the vectors in order to compare the predictive power of the noun classes in argument positions against the predictive power of the aspectual features. The test set of verbs was classified in a supervised learning procedure using a support vector machine (SVM) classifier. In order to compare the results obtained with aspectual verb classes as gold standard against those obtained with a gold standard classification based on concrete semantic categories compatible with Schumacher's (1986) typology of German verbs, we also trained the SVM classifier with a classification based on ten verb classes, which comprises classes such as verbs of consumption and verbs of handicraft working (Richter and van Hout, 2015). This classification was induced

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

[Wurzel1989] Wolfgang Ulrich Wurzel. 1989. Inflectional Morphology and Naturalness. Kluwer, Dordrecht.

Acknowledgments

This research has been supported by an ARC grant DE140101749 awarded to SDD. SV is a postdoctoral fellow at the Research Foundation - Flanders. A longer version of this work was also submitted to the 37th Annual meeting of the Cognitive Science Society, Pasadena, 2015. We wish to express our gratitude to Dan Navarro and Amy Perfors, who contributed to the longer version of this work.

References

[De Deyne et al.2013] Simon De Deyne, Daniel J. Navarro, and Gert Storms. 2013. Better explanations of lexical and semantic cognition using networks derived from continued rather than single word associations. Behavior Research Methods, 45:480–498.

[De Deyne et al.2014] Simon De Deyne, Wouter Voorspoels, Steven Verheyen, Daniel J. Navarro, and Gert Storms. 2014. Accounting for graded structure in adjective categories with valence-based opposition relationships. Language and Cognitive Processes, 29(5):568–583.

[De Deyne et al.in press] Simon De Deyne, Steven Verheyen, and Gert Storms. in press. The role of corpus-size and syntax in deriving lexico-semantic representations for a wide range of concepts. Quarterly Journal of Experimental Psychology.

[Gentner and Kurtz2005] Dedre Gentner and Kenneth J. Kurtz. 2005. Relational categories. In W. K. Ahn, R. L. Goldstone, B. C. Love, A. B. Markman, and P. W. Wolff, editors, Categorization inside and outside the lab., pages 151–175. American Psychology Association.

[Gentner and Kurtz2006] Dedre Gentner and Kenneth J. Kurtz. 2006. Relations, objects, and the composition of analogies. Cognitive Science, 30:609–642.

[Hutchison2003] Keith A. Hutchison. 2003. Is semantic priming due to association strength or feature overlap? Psychonomic Bulletin and Review, 10:785–813.

[Lancichinetti et al.2011] Andrea Lancichinetti, Filippo Radicchi, José J. Ramasco, and Santo Fortunato. 2011. Finding statistically significant communities in networks. PloS one, 6(4):e18961.

[Lin and Murphy2001] Emilie L. Lin and Gregory L. Murphy. 2001. Thematic relations in adults’ concepts. Journal of Experimental Psychology: General, 1:3–28.

[Lopez et al.1997] Alejandro Lopez, Scott Atran, John D. Coley, Douglas L. Medin, and Edward E. Smith. 1997. The tree of life: Universal and cultural features of folkbiological taxonomies and inductions. Cognitive psychology, 32(3):251–295.

[McRae et al.2005] Ken McRae, George S. Cree, Mark S. Seidenberg, and Chris McNorgan. 2005. Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37:547–559.

[Medin and Rips2005] Douglas L. Medin and Lance J. Rips. 2005. Concepts and categories: memory, meaning, and metaphysics. In K. Holyoak and R. Morrison, editors, The Cambridge Handbook of Thinking and Reasoning, pages 37–72. Cambridge University Press, Cambridge, UK.

[Medin et al.1997] Douglas L. Medin, Elizabeth B. Lynch, John D. Coley, and Scott Atran. 1997. Categorization and reasoning among tree experts: Do all roads lead to rome? Cognitive psychology, 32(1):49–96.

[Moors et al.2012] Agnes Moors, Jan De Houwer, Dirk Hermans, Sabine Wanmaker, Kevin van Schie, Anne-Laura Van Harmelen, Maarten De Schryver, Jeffrey De Winne, and Marc Brysbaert. 2012. Norms of valence, arousal, dominance, and age of acquisition for 4,300 dutch words. Behavior research methods, pages 1–9.

[Niedenthal et al.1999] Paula M. Niedenthal, Jamin B. Halberstadt, and Åse H. Innes-Ker. 1999. Emotional response categorization. Psychological Review, 106(2):337.

[Rosch1973] Eleanor Rosch. 1973. Natural categories. Cognitive Psychology, 4:328–350.

[Ruts et al.2004] Wim Ruts, Simon De Deyne, Eef Ameel, Wolf Vanpaemel, Timothy Verbeemen, and Gert Storms. 2004. Dutch norm data for 13 semantic categories and 338 exemplars. Behaviour Research Methods, Instruments, and Computers, 36:506–515.

[Samsonovic and Ascoli2010] Alexei V. Samsonovic and Giorgio A. Ascoli. 2010. Principal semantic components of language and the measurement of meaning. PloS one, 5(6):e10921.

[Sharp et al.1979] Donald Sharp, Michael Cole, Charles Lave, Herbert P. Ginsburg, Ann L. Brown, and Lucia A. French. 1979. Education and cognitive development: The evidence from experimental research. Monographs of the society for research in child development, pages 1–112.

[Wisniewski and Bassok1999] Edward J. Wisniewski and M. Bassok. 1999. What makes a man similar to a tie? Cognitive Psychology, 39:208–238.

Of crowds and corpora: A marriage of measures

Emmanuel Keuleers, Paweł Mandera, Michaël Stevens, & Marc Brysbaert
Department of Experimental Psychology
Ghent University
Henri Dunantlaan 2, 9000 Gent, Belgium
{emmanuel.keuleers, pawel.mandera, michael.stevens, marc.brysbaert}@ugent.be

Abstract

We discuss the relationship between a word's corpus frequency and its prevalence (the proportion of people who know the word) and show that they are complementary measures. We show that adding word prevalence as a predictor of lexical decision reaction time in the Dutch lexicon project increases explained variance by more than 10%. In addition, we show that, for the same dataset, word prevalence is the best independent predictor of word processing time.

1 Introduction

Word frequency is one of the most important measures in the cognitive study of word processing, both theoretically and methodologically. Its contribution in explaining behavioural measures such as reaction time is so large that researchers take great care in collecting large and reliable corpora and in applying the best possible word frequency estimates in their research.

1.1 Where the corpus is weak the crowd is strong

A drawback of frequency counts is that, regardless of corpus size, lower counts are unreliable. As an example, consider asking a random sample of 100 people whether they know each of the word types that occur just once in a large corpus. Although frequency for all these types is equal, the number of judges knowing each word will vary from zero to one hundred and, as the judges are language users, words known to many of them may be considered to occur more often in language than words which are known by fewer of them. Following this reasoning, the estimate of the number of language users who know a word, or word prevalence, may give a better indication of occurrence than corpus frequency counts.

1.2 Where the corpus is strong the crowd is weak

On the other hand, consider presenting the same random sample of people with words from the language's core vocabulary. Since these words will be known to all of the judges, prevalence will be uniformly high and uninformative. In this case corpus counts should be a much better estimate of occurrence.

2 Testing the prevalence measure

To test the complementarity of prevalence and frequency as measures of occurrence, we used prevalence norms for Dutch collected through a lexical decision task presented as an online vocabulary test (Keuleers, Stevens, Mandera, & Brysbaert, in press). Each participant saw 100 stimuli (about 70 words and 30 nonwords) selected randomly from a list of 54,319 words and 21,734 nonwords. In the current analysis, we used the data of 190,771 participants who indicated that they were living in Belgium, giving us about 250 observations per word. The score for a word obtained by fitting a Rasch model (a mathematical model simultaneously ranking participants by ability and test items by difficulty) to the data was considered an operationalization of its prevalence.

Table 3: Top 5 false positives ordered by cluster in-strength per category. Most of the false positives are thematic in nature. For instance, false positives for BIRDS include beak, egg, nest, and whistle.

Category         1                 2          3           4             5
FRUIT            fruit             juicy      pit         pick          summer
VEGETABLES       vegetable         healthy    puree       sausage       hotchpotch
BIRDS            bird              beak       nest        whistle       egg
INSECTS          insect            vermin     beast       crawl         animal
FISH             fish              fishing    rod         slippery      water
MAMMALS          rodent            gnaw       tail        pen           marten
REPTILES         reptile           scales     animal      tail          amphibian
CLOTHING         clothing          fashion    blouse      collar        zipper
KITCHEN UT.      cooking           kitchen    stove       cooker hood   burning
MUSICAL INSTR.   wind instrument   to blow    fanfare     orchestra     harmony
TOOLS            tools             carpenter  carpentry   wood          drill
VEHICLES         speed             drive      vehicle     motor         circuit
WEAPONS          sharp             stab       blade       point         stake

a constrained version of the word association task, and the key difference is the number of thematic responses one gets in both procedures. Similarly, feature generation stimuli are usually restricted to concrete nouns, which places restrictions on what words can be grouped together. In other words, the tendency to find taxonomic categories may be a result of restricting the task.

To test this idea, we used the word association data to construct a network that included only those 588 words that belonged to one of the taxonomic categories. Moreover, in order to approximate the “shared features” measure that is more typical of feature generation tasks, we computed the cosine similarity between pairs of words. That is, words that have the same associates are deemed more similar, and this similarity was used to weight the edges in the restricted network.2 We then applied the clustering procedure to this restricted network and repeated the analysis from the previous section. The F-statistics from this analysis are reported as the F0-values in Table 2. This time, the results of the clustering show a high degree of agreement with the taxonomic organization, with an average F-value of 0.79. The only exception was REPTILES, which upon inspection appears to reflect a failure to distinguish REPTILES from INSECTS.

The success of this analysis suggests two things. First, the word association task does encode taxonomic information, as evidenced by the fact that we are able to reconstruct taxonomic categories. However, the fact that the only way to do so is to mimic all the restrictive characteristics of a feature generation task (e.g., limited word set) is revealing. Taxonomic information is not the primary means by which the mental lexicon is organized: if it were, we should not have to resort to such drastic restrictions in order to uncover taxonomic categories.

In summary, even at the most detailed level of the hierarchy, only limited evidence for a taxonomic view along the lines of Rosch was found, even for typical taxonomic domains like animals. These results suggest that in much of the previous work the pervasive contribution of affective and thematic or relational knowledge structuring might be overlooked by a selection bias in terms of the concepts (nouns, mostly concrete) and semantic relations (predominantly taxonomic). This finding is in line with previous results indicating that network derived similarity estimates account better for human thematic relatedness judgments than for taxonomic relatedness judgments (De Deyne et al., in press). In priming studies, the dominance of thematic over taxonomic structure can also explain facilitation when thematic but not coordinate prime-target pairs are used (Hutchison, 2003). Finally, our findings converge with recent evidence that highlights the role of thematic representations even in domains such as animals (Gentner and Kurtz, 2006; Lin and Murphy, 2001; Wisniewski and Bassok, 1999), whereas previous reports that have stressed taxonomic organization might be more exceptional as they are heavily culturally defined (Lopez et al., 1997), a consequence of formal education (Sharp et al., 1979), or reflect different levels of expertise (Medin et al., 1997).

2 Note that one could also derive such a similarity-based network for the complete lexicon, which would reflect the similarity between cues rather than their weighted associative strength. We did in fact do this. It produced similar results to the original analysis.
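The cosine-based edge weighting used for the restricted word association network can be sketched as follows. This is a minimal illustration only, assuming each cue word is represented by a dictionary of associate response counts; the toy counts and the names `cosine_similarity`, `responses` and `edges` are hypothetical and are not taken from the study's data or code.

```python
import math

def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical associate counts for three cue words (not real norm data).
responses = {
    "robin":   {"bird": 5, "red": 3, "nest": 2},
    "sparrow": {"bird": 6, "small": 2, "nest": 1},
    "hammer":  {"nail": 7, "tool": 4},
}

# Edge weights of the similarity network: one weight per unordered pair of cues.
cues = sorted(responses)
edges = {
    (u, v): cosine_similarity(responses[u], responses[v])
    for i, u in enumerate(cues)
    for v in cues[i + 1:]
}
```

In this toy example the two bird cues share associates and receive a high edge weight, whereas cues with no associates in common receive a weight of zero.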


Figure 1: The relationship between frequency and prevalence. Word frequency is displayed as Zipf-score (log frequency per billion words; Van Heuven et al., 2014).

Figure 1 shows the complementary relation between the SUBTLEX-NL word frequencies (based on a 42 million word corpus of film and television subtitles; see Keuleers, Brysbaert, & New, 2010) and the prevalence measure obtained from the online vocabulary test. Higher z-scores indicate more prevalent words. The dark lines at the bottom half of the plot indicate words with singularly low frequencies over a large range of prevalence. The elongated cluster at the right side of the plot shows words with nearly full prevalence over large frequency ranges.

In addition, we investigated the relationship between prevalence and other typical measures of word frequency. Table 1 gives an overview of these correlations.

                       Frequency   Prevalence   OLD20   Length
Frequency                   1.00         0.35   -0.34    -0.37
Prevalence                  0.35         1.00    0.00     0.07
OLD20                      -0.34         0.00    1.00     0.74
Length                     -0.37         0.07    0.74     1.00
Contextual Diversity        0.98         0.36   -0.34    -0.35

Table 1: Correlations between main predictors of Lexical Decision RT in the Dutch Lexicon Project

Table 1 shows that the correlation between prevalence and frequency was relatively low (.34), giving further evidence that prevalence is distinct from word frequency and from contextual diversity (a word's document count), which correlates very highly with word frequency.

Finally, we used the data from the 7,885 items in the Dutch Lexicon Project (Keuleers et al., 2010) for which both frequency and prevalence were available to examine the contributions of Dutch corpus word frequency (SUBTLEX-NL, Keuleers et al., 2010) and word prevalence on average reaction times.

In single variable analyses, log word frequency explained about 36.13% of the variance in reaction times and prevalence explained about 33.03% of the variance in reaction times.

This was also made clear when both measures were considered in the same analysis, where both measures jointly explained 51.37% of the variance in reaction times. The unique contributions to explained variance (eta-squared) were 27.39% for frequency and 23.87% for prevalence. In further analyses, we found that including the quadratic trend of word frequency and contextual diversity did not substantially alter this pattern of results.

3 Conclusion

The results show that, next to word frequency, prevalence is by far the most important independent contributor to visual word recognition times, suggesting that prevalence should be included in any analysis where word corpus frequency is considered to be relevant. However, several questions remain open. First, what is the influence of corpus size on the relation between corpus word frequency and prevalence and on the contribution of prevalence to lexical processing? Second, how well does prevalence perform on other tasks and in other languages? Finally, does the effect of prevalence on word processing truly lie in a better measurement of word occurrence or does it partly reflect an independent property associated with the learnability of a word?

2 Evaluating Taxonomic Structure

To test whether the clusters provide evidence for a hierarchical taxonomic view along the lines of Rosch and colleagues (Rosch, 1973) or support an alternative view based on thematic relations identified in the previous section, data from an exemplar generation task from Ruts et al. (2004) were used. In this task, 100 participants generated as many exemplars as they could think of for six artifact categories (CLOTHING, KITCHEN UTENSILS, MUSICAL INSTRUMENTS, TOOLS, VEHICLES, and WEAPONS) and seven natural kinds categories (FRUIT, VEGETABLES, BIRDS, INSECTS, FISH, MAMMALS, and REPTILES). If the clusters in the word association network group together different types of birds, vehicles, fruits, and so on, this would indicate a taxonomic organization of semantic memory. For each category, we investigated the size of the best matching cluster and calculated precision and recall in terms of the F-measure for clustering performance.

A taxonomic-like organization would be evident in clusters with high precision and recall, resulting from many true positives and few false positives and false negatives. For instance, if the cluster corresponding to the category BIRDS contained robin (a true positive) and did not contain spoon (a true negative), that would increase the F-score. Conversely, if it contained guitar (a false positive) or did not contain ostrich (a false negative), that would decrease the F-score. This way, high F-scores should reflect categories that are not overly specific (many false negatives) or general (many false positives).

On average, the best matching clusters were found at Level 5. The results for each category are shown in Table 2. The number of members generated in the exemplar generation task was on average 41 for the seven natural kinds categories, which is in the same range as the average best matching cluster size of 42. For artifacts the generated categories included on average 55 members, which was somewhat larger than the obtained average cluster size of 37.

The resulting F-values were on average 0.48 for the natural categories and 0.28 for the artifacts, indicating only limited support for the presence of taxonomic categories. The highest values were obtained for FISH (F = .57) and REPTILES (F = .65), where most items in the clusters were true category members.

Inspecting the false positives for each of the clusters in Table 3 confirms the validity of the approach, as in the majority of the cases the superordinate label (e.g., fruit, tools, etc.) was the most central member of each cluster. The remaining intrusions were thematic in nature (e.g., FRUIT: pick, BIRDS: nest), thus confirming our earlier exploratory findings.

One potential response to the previous analyses relates to the nature of the data upon which they are based. Perhaps the word association task simply fails to capture taxonomic information, and if so, the results of these analyses are simply an artifact of the choice of task. Alternatively, perhaps the “failure” arises because the word association task is more general than the tasks typically used to study taxonomic categories.

There is some evidence that a different choice of task would produce different results. For instance, much of the work on taxonomic organization relies on tasks in which participants are asked to list features of entities (McRae et al., 2005; Ruts et al., 2004). One could argue that feature generation is

Table 2: F-values and cluster sizes for items generated for 13 concrete noun categories. Nhuman is the category size based on the exemplar generation task; Nc is the size of the best-matching cluster; F captures precision and recall according to the human categories for the full network. F0 is calculated from a network that excluded potential thematic information. F-values are fairly low, indicating lack of correspondence between the clusters and the taxonomic categories. Excluding thematic information results in F0 values that do capture taxonomic information.

Category        Nhuman   Nc    F      F0
FRUIT               40   50   0.47   0.84
VEGETABLES          35   58   0.50   0.90
BIRDS               53   63   0.53   0.90
INSECTS             40   34   0.46   0.68
FISH                37   48   0.57   0.91
MAMMALS             61   21   0.20   0.76
REPTILES            21   22   0.65   0.51
Mean                41   42   0.48   0.79
CLOTHING            46   70   0.35   0.80
KITCHEN UT.         71   18   0.20   0.66
MUSIC INSTR.        46   24   0.37   0.89
TOOLS               73   56   0.25   0.76
VEHICLES            46   28   0.16   0.73
WEAPONS             46   25   0.37   0.88
Mean                55   37   0.28   0.79
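The F-measure used to match clusters against human-generated categories can be sketched as follows. This is a minimal illustration, assuming the balanced F-measure (the harmonic mean of precision and recall); the text does not state which weighting was used, and the toy word sets and the names `f_measure` and `best_matching_cluster` are hypothetical.

```python
def f_measure(cluster: set, category: set) -> float:
    """Harmonic mean of precision and recall of a cluster against a reference category."""
    true_positives = len(cluster & category)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(cluster)   # lowered by false positives
    recall = true_positives / len(category)     # lowered by false negatives
    return 2 * precision * recall / (precision + recall)

def best_matching_cluster(clusters: list, category: set):
    """Return the (cluster, F) pair with the highest F-measure for the category."""
    return max(((c, f_measure(c, category)) for c in clusters), key=lambda pair: pair[1])

# Hypothetical toy data, loosely following the BIRDS example in the text.
birds = {"robin", "ostrich", "sparrow", "eagle"}
clusters = [
    {"robin", "sparrow", "eagle", "nest", "beak"},  # 3 true positives, 2 false positives
    {"guitar", "piano", "ostrich"},                 # 1 true positive, 2 false positives
]
cluster, score = best_matching_cluster(clusters, birds)
```

In this toy case the first cluster is selected: thematic intruders such as nest and beak lower its precision, and the missing ostrich lowers its recall.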

Acknowledgments

The text of this abstract is an early summary of findings from a larger study reported in the Quarterly Journal of Experimental Psychology as Word knowledge in the crowd: Measuring vocabulary size and word prevalence in a massive online experiment. (Keuleers, E., Stevens, M., Mandera, P., & Brysbaert, M., in press).

References

Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., … Treiman, R. (2007). The English lexicon project. Behavior Research Methods, 39(3), 445–459.

Keuleers, E., Brysbaert, M., & New, B. (2010). SUBTLEX-NL: A new measure for Dutch word frequency based on film subtitles. Behavior Research Methods, 42(3), 643–650. doi:10.3758/BRM.42.3.643

Keuleers, E., Diependaele, K., & Brysbaert, M. (2010). Practice Effects in Large-Scale Visual Word Recognition Studies: A Lexical Decision Study on 14,000 Dutch Mono- and Disyllabic Words and Nonwords. Frontiers in Psychology, 1. doi:10.3389/fpsyg.2010.00174

Keuleers, E., Lacey, P., Rastle, K., & Brysbaert, M. (2011). The British Lexicon Project: Lexical decision data for 28,730 monosyllabic and disyllabic English words. Behavior Research Methods, 44(1), 287–304. doi:10.3758/s13428-011-0118-4

Keuleers, E., Stevens, M., Mandera, P., & Brysbaert, M. (in press). Word knowledge in the crowd: Measuring vocabulary size and word prevalence in a massive online experiment. Quarterly Journal of Experimental Psychology. doi:10.1080/17470218.2015.1022560

Figure 1: Hierarchical tree visualization of clusters in the lexicon with five most central members in terms of cluster in-strength.

Table 1: Overview of the hierarchical cluster structure showing five levels (Level 1 is broadest, Level 5 is most precise). The statistics include total number of clusters N, average cluster size ⟨Nc⟩ and its standard deviation sd(Nc), number of homeless nodes Nhomeless, number of nodes member of multiple clusters Noverlapping, and the average p-value ⟨p⟩.

Level            1       2       3       4       5
N                2       7       37      161     506
⟨Nc⟩             8588    3049    515     112     25
sd(Nc)           2112    973     364     66      12
Nhomeless        18      18      39      86      380
Noverlapping     5943    6956    5263    4717    1676
⟨p⟩              0       0.062   0.04    0.035   0.051

are applicable to 3,642 non-overlapping words in our clusters. The valence judgments differed significantly between our two clusters according to an independent t-test (t(3640) = 7.367, CI = [0.190,0.327]). This post-hoc test confirmed our interpretation of a valence difference between the clusters, which brings further support to studies that indicated valence is the most important dimension in semantic space (De Deyne et al., 2014; Samsonovic and Ascoli, 2010) and empirical findings highlighting affect-based category structure (Niedenthal et al., 1999).

At Levels 2 to 4, the meaning clusters become increasingly more concrete. For instance, Level 2 shows that the “negative” cluster in Level 1 includes clusters with abstract words or words related to human culture (school, money, religion, time,...) which are now differentiated from a purely negative cluster with central members like negative, sad, and crossed. The subdivisions of the “positive” cluster involve the central nodes nature, music, sports, and food, which might be interpreted as covering sensorial information and natural kinds.

At the lowest level, 506 clusters were identified, with an average size of 25 words. A total of 1,676 words occurred in multiple clusters; at least a part of them because of homonymy (e.g., bank) or polysemy (e.g., language, assigned to clusters about nationality, speech, language education, and communication). Most importantly, inspection of the content of all clusters exhibited a widespread thematic structure: the clusters were often composed of both nouns (racket), adjectives (loud), and verbs (to sound), which does not reflect a pure taxonomy of entities, but also includes properties and actions.
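The independent t-test on valence judgments reported above can be sketched as follows. This is only an illustration under stated assumptions: the ratings are invented (they are not the Moors et al. (2012) norms), the function name `pooled_t_test` is hypothetical, and a pooled-variance (Student) test is assumed because the reported degrees of freedom, 3640, equal the 3,642 rated words minus two.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_test(a, b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two sample means.
    se = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / se, df


# Hypothetical valence ratings for words in two clusters (illustrative only).
cluster_a = [5.1, 4.8, 5.6, 5.0, 4.9]
cluster_b = [3.9, 4.2, 3.5, 4.1, 3.8]
t, df = pooled_t_test(cluster_a, cluster_b)
print(f"t({df}) = {t:.3f}")
```

With real norms, the two lists would hold the ratings of the non-overlapping words in each Level 1 cluster, so that their combined length is 3,642 and `df` comes out as 3640.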

Using network clustering to uncover the taxonomic and thematic structure of the mental lexicon

Simon De Deyne
University of Adelaide
School of Psychology
5005 Adelaide, Australia
[email protected]

Steven Verheyen
University of Leuven
Department of Psychology
Tiensestraat 102, 3000 Leuven, Belgium
[email protected]

While still influential, the view that concepts are organized as a hierarchical taxonomy as proposed by Rosch (1973) has been challenged on several occasions. For example, some studies have attributed a larger role to thematic relations (Gentner and Kurtz, 2005; Lin and Murphy, 2001), whereas others have stressed the role of affect in structuring word meaning (Niedenthal et al., 1999). A comprehensive account of how these different principles shape and structure meaning in the lexicon is missing, and most studies continue to be biased towards concrete noun categories that fit into hierarchical taxonomies (Medin and Rips, 2005). To capture mental or psychological properties that organize the lexicon for a wide range of concepts and semantic relations, we propose a large-scale semantic network derived from word associations as the basis to uncover what the structural principles are.

1 Network Clustering

Since this is one of the first times the mental lexicon is mapped in its entirety using an extremely extensive word association corpus, an exploratory approach is warranted. To achieve this, network clustering was used as a way to study how the mental lexicon can be structured at different scales and what type of semantic relations dominate its structure. At the basis lies a semantic network derived from a large scale word association corpus including over 12,000 cues and 3.77 million responses (De Deyne et al., 2013). For the purpose of this study, non-dominant word forms were removed (e.g., apples was removed if apple was also present) resulting in a network of 11,000 words. Next, the recent Order Statistics Local Optimization Method (OSLOM) was applied to identify statistically reliable clusters in a directed weighted word associations network (Lancichinetti et al., 2011). This method includes words in the final cluster solution on the basis of statistical criteria and allows for overlapping clusters. Similar to taxonomic theories of knowledge representation, words are grouped in progressively larger clusters, which allows us to evaluate structural properties of the lexicon at different scales. This hierarchical structure is also derived from the data by using a statistical criterion that involves a comparison with an appropriate null-model for the weighted directed graph.

Applying OSLOM to the semantic network resulted in a solution with five hierarchical levels. An overview of this solution is shown in Table 1. There was a large degree of variability in the number of clusters across the five hierarchical levels ranging between 2 and 506 clusters. On average, the p-value of the extracted clusters was low across all levels, indicating that the obtained clusters were unlikely to arise in a comparable random network¹. There were few homeless nodes at any level, indicating that most words were reliably attributed to a specific cluster. There was also a considerable degree of overlap at all levels relative to the size of the clusters; clusters were more distinct at the more precise levels, where more clusters were obtained. For instance, at the lowest level 1,676 words appeared in multiple clusters, compared to 5,943 at the highest level.

Figure 1 illustrates the obtained clusters with the most prototypical examples of each cluster at various levels. At the most general level, Figure 1 shows two distinct clusters, with one of them containing highly central words with a negative connotation. In order to verify whether this interpretation is supported statistically, we used the valence judgments reported by Moors et al. (2012), which

¹ Default parameters were used in the OSLOM algorithm, except for the p cut-off value. Setting this value depends on the task as it affects the size of the clusters (Lancichinetti et al., 2011). In this application, the cutoff was set at 0.25, because the few clusters in the final solution with high p-values were easy to interpret. Other values of p did not alter the general pattern of results we report here.

Perception of gesturally distinct consonants in Persian

Reza Falahati
Laboratorio di Linguistica
Scuola Normale Superiore
Piazza dei Cavalieri 7
56126 Pisa, Italy
[email protected]

Chiara Bertini
Laboratorio di Linguistica
Scuola Normale Superiore
Piazza dei Cavalieri 7
56126 Pisa, Italy
[email protected]

Abstract

This study explores the sensitivity of individuals to the residual gestures remaining after the simplification of consonant clusters. Three sets of target stimuli having full, reduced, and zero alveolar gestures along with the control stimuli were used in a perceptual identification task. The results of the experiment showed that subjects reliably distinguished the three target sets with varying residual gestures from the control. The results also showed that the degree of residual gestures affects the rate of [t] perception by the subjects; however, this was not statistically significant. The results are discussed in the context of different theories of speech perception.

1 Introduction

This study investigates the perception of three categories of consonant clusters that are perceptually similar but gesturally distinct. In Persian, word-final coronal stops are optionally deleted when they are preceded by obstruents or the homorganic nasal /n/. For example, the final clusters in the words /ræft/ “went”, /duχt/ “sew” and /qæsd/ “intention” are optionally simplified¹ in fast/casual speech, resulting in: [ræf], [duχ], and [qæs], respectively. The articulatory study conducted on this process in Persian by Falahati (2013) has shown that the gestures of the deleted segments are often still present. More specifically, the findings showed that of the clusters that sounded simplified, some had no alveolar gesture, some had gestural overlap that masked at least some of the acoustic information for [t], and some had reduced alveolar gestures. The current study tests listeners’ sensitivity to these three types of /t/ realizations.

2 Background

Choosing the basic units or building blocks by which the phenomena in a discipline could be explained is fundamentally important. Due to the “complex” nature of language, there is no consensus among linguists as to the nature of this basic unit in the field. The controversy over choosing the building blocks extends to the domain of speech perception where different models have postulated various basic units of processing and storage.

In general, there are two major theoretical approaches to speech perception: gesturalist theories versus auditory and exemplar theories. The two main gestural theories of speech perception are Motor Theory and Direct Realism (MT and DR, henceforth). In motor theories, the intended phonetic gestures of the speaker are considered to be the objects of speech perception. These gestures are “represented in the brain as invariant motor commands that call for movements of the articulators through certain linguistically significant configurations” (Liberman and Mattingly 1985, p. 2). The main motivation for choosing such a basic unit in MT, among other factors, is the existence of patterns where different acoustic cues could give rise to the same phonetic percept or where variant phonetic percepts were found for the same synthetic speech across different contexts (Delattre et al., 1955, 1964; Liberman 1957; Liberman and Mattingly 1985). Despite the fact that this theory has gone through significant changes from its inception, all the versions share the idea that the objects of speech perception are articulatory events rather than acoustic or auditory events.

¹ The term “simplification” is used here for the acoustic and perceptual consequence of apparent coronal consonant deletion, regardless of whether there is a residual articulatory gesture.

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

An intended gesture is produced by a number of muscles that act in concert sometimes ranging over more than one articulator. For instance, constriction needed for producing coronal stops involves the action of the tip/blade of the tongue and the jaw; however, such a constriction is considered one gesture. According to MT, the orchestration among gestures is quite systematic and listeners can use the systematically varying acoustic cues for coronal stops as information to detect the related consonant gestures.

MT assumes a biological link between perception and production. According to this perspective both speech perception and speech production share the same set of invariants and are governed by auditory principles. “The motivation for articulatory and coarticulatory maneuvers is to produce just those acoustic patterns that fit the language-independent characteristics of the auditory system” (Liberman and Mattingly, 1985, p. 6). The acoustic signal only serves as a source of information about the gestures. It is the gestures which define the phonetic category.

The other main gestural theory to speech perception is direct realism. Both DR and MT share the claim that listeners to speech perceive vocal tract gestures. However, in DR it is the phonological gestures of the vocal tract, rather than the intended gestures, which are the perceptual objects (Fowler 1981, 1984, 1996). According to DR, “the temporal overlap of vowels and consonants does not result in a physical merging or assimilation of gestures; instead, the vowel and consonant gestures are coproduced. That is, they remain, to a considerable extent, separate and independent events...” (Diehl et al., 2004, p. 153). If we could extend this to the gestures of two adjacent consonants, one should expect that the gestures related to them also remain separate and distinct from each other.

In contrast to gestural theories, the auditory theories assume that speech sounds are perceived via general cognitive and learning mechanisms. In this view, speech is not special and listeners do not perceive gestures. The auditory approach to perception mainly considers general auditory mechanisms responsible for perceptual performance. According to this view, the speech and nonspeech stimuli do not invoke a special or speech-specific module. Gestures have no mediatory role as to the perception of speech sounds in this approach. Listeners use multiple imperfect acoustic cues in order to categorize the complex stimuli with structured variance (Diehl et al., 2004). According to this approach, the phonological representations are assumed to be speaker independent and they are associated with each word in the listener’s mental lexicon. The proponents of this approach take, for example, categorical perception of non-speech sounds or categorical-like perception by non-human animals as evidence for their argument. They also consider some of the cross-linguistic sound patterns and the “maximal auditory dispersion” in vowel systems as further support for their claim (Ohala 1990, 1995).

Exemplar theories form another approach to speech perception where words and frequently-used grammatical constructions are represented in memory as large sets of exemplars containing fine phonetic information. Listeners are sensitive to phonetic details existing in the speech signal. In such a speech perception model, a mechanism is needed for gradiently changing the lexical representations over time. In order to do so, the perceptual system must be capable of making fine phonetic distinctions (Johnson 1997).

These different approaches to speech perception have been tested in different studies. Beddor et al. (2013), for example, used eye-tracking to assess listeners' use of coarticulatory vowel nasalization as that information unfolded in real time. In the experiment, subjects heard the nasalized vowels with two different time latencies. The prediction was that subjects will fixate on the related image sooner when they hear the nasalized vowel earlier. The results showed that listeners use relevant acoustic cues, which was argued to allow listeners to track the gestural information. Nalon (1992) in an identification task tested whether participants could identify different degrees of velar assimilation. He used four different articulation types called full alveolar, residual alveolar, zero alveolar (i.e., full assimilation to the following velar), and nonalveolar (i.e., velar in underlying representation). The results of his study showed that the participants perceived full alveolar tokens with 100% accuracy with /d/ responses while less than half the tokens with residual alveolar were identified with /d/ responses. In another study, Pisoni showed that the nonspeech analogs of VOT stimuli are perceived categorically. Similar studies like this were taken as evidence against MT which claimed categorical perception as a specific feature of the speech mode of perception.

Carita Paradis, Caroline Willners and Steven Jones. 2009. Good and bad opposites: using textual and psycholinguistic techniques to measure antonym canonicity. The Mental Lexicon, (4), 3: 380-429.

Carita Paradis and Caroline Willners. 2011. Antonymy: from convention to meaning-making. Review of Cognitive Linguistics, 9: 367-391.

Francesca Pesciarelli, Tania Gamberoni, Fabio Ferlazzo, Leo Lo Russo, Francesca Pedrazzi, Ermanno Melati and Cristina Cacciari. 2014. Is the comprehension of idiomatic sentences indeed impaired in paranoid Schizophrenia? A window into semantic processing deficits. Frontiers in Human Neuroscience, 9 October 2014, 8:799, doi:10.3389/fnhum.2014.00799.

Edith Pomarol-Clotet, Tomasina, M. Oh, Keith R. Laws and Peter J. McKenna. 2008. Semantic priming in schizophrenia: systematic review and meta-analysis. British Journal of Psychiatry, 192: 92–97.

Manfred Spitzer, Ursula Braun, Leo Hermle and Sabine Maier. 1993. Associative semantic network dysfunction in thought-disordered schizophrenic patients: direct evidence from indirect semantic priming. Biological Psychiatry, 34: 864-877.

Debra Titone, Deborah L. Levy and Philip S. Holzman. 2000. Contextual insensitivity in schizophrenic language processing: evidence from lexical ambiguity. Journal of Abnormal Psychology, 109: 761-767.

Debra Titone, Philip S. Holzman and Deborah L. Levy. 2002. Idiom processing in schizophrenia: literal implausibility saves the day for idiom priming. Journal of Abnormal Psychology, 111: 313-320.

Debra Titone, Maya Libben, Meg Niman, Larissa Ranbom and Deborah L. Levy. 2007. Conceptual combination in schizophrenia: contrasting property and relational interpretations. Journal of Neurolinguistics, 20: 92–110.

Joost van de Weijr, Carita Paradis, Caroline Willners and Magnus Lindgren. 2014. Antonym canonicity: temporal and contextual manipulations. Brain and Language, 128: 1-8.

Sophia Vinogradov, John H. Poole, Jason Willis-Shore, Beth A. Ober and Gregory K. Shenaut. 1998. Slower and more variable reaction times in schizophrenia: what do they signify? Schizophrenia Research, 32: 183-190.

Caroline Willners. 2001. Antonyms in Context. A Corpus-based Semantic Analysis of Swedish Descriptive Adjective. Travaux de l’Institut de Linguistique de Lund, 40, Dept. of Linguistics, Lund University.

mally dissimilar meanings (Paradis and Willners, 2011). This ability to a large extent relies on preserved executive resources, integrity of the semantic processing system and size of the lexicon.

References

Deanna, M. Barch and Alan Ceaser. 2012. Cognition in schizophrenia: core psychological and neural mechanisms. Trends in Cognitive Sciences, 1: 27-34.

Ivana Bianchi, Ugo Savardi and Michael Kubovy. 2011. Dimensions and their poles: a metric and topological theory of opposites. Language and Cognitive Processes, 26: 1232-1265.

Stephen Blumberg and Donald W. Giller. 1965. Some verbal aspects of primary-process thought: a partial replication. Journal of Personality and Social Psychology, 95: 517-520.

Gildas Brébion, Rodrigo A. Bressan, Ruth I. Ohlsen, Lyn S. Pilowsky and Anthony, S. David. 2010. Production of atypical category exemplars in patients with schizophrenia. Journal of the International Neuropsychological Society, (16), 5: 822-828.

Alvin G. Burstein. 1961. Some verbal aspects of primary process thought in schizophrenia. The Journal of Abnormal and Social Psychology, 62: 155-157.

Cristina Cacciari, Francesca Pesciarelli, Tania Gamberoni, Fabio Ferlazzo, Leo Lo Russo, Francesca Pedrazzi, and Ermanno Melati. 2015. Is black always the opposite of white? An investigation on the comprehension of antonyms in people with schizophrenia and in healthy participants. Behavioral Sciences, 5:93-112.

Jonathan, D. Cohen, Deanna, M. Barch, Cameron Carter and David Servan-Schreiber. 1999. Context-processing deficits in schizophrenia: converging evidence from three theoretically motivated cognitive tasks. Journal of Abnormal Psychology, 108: 120-133.

Sebastian J. Crutch, Paul Williams, Gerard R. Ridgway and Laura Borgenicht. 2012. The role of polarity in antonym and synonym conceptual knowledge: evidence from stroke aphasia and multidimensional ratings of abstract words. Neuropsychologia, 50: 2636-2644.

Christiane Fellbaum. 1998. (ed.) WordNet: an electronic lexical database. Cambridge, MA: MIT Press.

Philip D. Harvey. 2010. Cognitive functioning and disability in schizophrenia. Current Directions in Psychological Science, 19: 249-254.

Earl Hunt. 1977. We know who knows, but why? In R. C. Anderson, R. J. Spiro & W. E. Montague (eds) Schooling and the acquisition of knowledge, 327-333. Hillsdale NJ: Laurence Erlbaum.

Hyeon-AE Jeon, Kyoung-Min Lee, Young-Bo Kim and Zang-Hee Cho. 2009. Neural substrates of semantic relationships: common and distinct left-frontal activities for generation of synonyms vs. antonyms. NeuroImage, 48: 449-457.

Steven Jones. 2002. Antonymy: A corpus-based perspective. London: Routledge.

Michael Kiang. 2010. Schizotypy and language: a review. Journal of Neurolinguistics, 23: 193-203.

Gina R. Kuperberg. 2010a. Language in Schizophrenia. Part 1. An Introduction. Language and Linguistic Compass (4), 8: 576-589.

Gina R. Kuperberg. 2010b. Language in Schizophrenia. Part 2. What psycholinguistics bring to the study of schizophrenia… and vice versa? Language and Linguistic Compass (4), 8: 590-604.

Deborah L. Levy, Micheal J. Coleman, Heejong Sung, Fei Ji, Steven Matthysse, Nancy R. Mendell and Debra Titone. 2010. The genetic basis of thought disorder and language and communication disturbances in schizophrenia. Journal of Neurolinguistics, 23: 176-192.

Muriel D. Lezak, Diane B. Howieson and David W. Loring. 2004. Neuropsychological assessment (4th Edition). Oxford: Oxford University Press.

Michael J. Minzenberg, Beth A. Ober and Sophia Vinogradov. 2002. Semantic priming in schizophrenia: a review and synthesis. Journal of the International Neuropsychological Society, 8: 699-720.

Lynne, M. Murphy. 2003. Semantic relations and the lexicon: antonymy, synonymy and other paradigms. Cambridge, UK: Cambridge University Press.

Margaret A. Niznikiewicz, Michelle Friedman, Marta E. Shenton, Martina Voglmaier, Paul G. Nestor, Melissa Frumin, Larry Seidman, John Sutton and Robert W. McCarley. 2004. Processing sentence context in women with schizotypal personality disorder: an ERP study. Psychophysiology, 41: 367-371.

In this study, I will use three sets of simplified consonant clusters which are auditorily similar but gesturally different. The consonant clusters (i.e., C1C2#) occur in the coda of words followed by another word which also starts with a consonant, therefore giving us three consonants in a row in an intervocalic environment (i.e., V1C1C2#C3V2). The prediction is that if subjects are sensitive to residual gestures, they should judge the stimulus sets differently. The stimuli set with no coronal gesture is expected to show the same pattern as the control (with zero coronal gesture in the underlying representation). The stimuli with overlapped gestures and reduced gestures are predicted to show a pattern different both from the control and from the stimuli with zero residual gestures. The following section introduces the methodology of the study.

3 Methodology

3.1 Participants

Thirty-two Persian-speaking students from the Università di Pisa and Sant’Anna, seventeen females and fifteen males, aged 18-38, participated in this study. The results of eight of them are not considered for analysis because they reported being bilingual and mainly used a language other than Persian at home or with their close friends. This resulted in twenty-four participants, twelve females and twelve males. None of them reported any hearing problem.

3.2 Stimuli

Three sets of target words varying only in the degree/amount of alveolar residual gestures and one control stimuli set were used in the experiment. The three target categories are mainly the same except for the degree of alveolar residual gestures. The Target Full_G category has a full coronal gesture but has overlap, hence it is marked with two superscripts [tt]. The Target Partial_G category has a partial residual gesture marked via superscript [t], whereas Target Zero_G has no gestural leftover. The stimuli in the control are used as the baseline since they don’t have any underlying coronal stop in the coda position of the first word. Some examples of the target and control words are given below:

Target Full_G: [æχᵗᵗ kɑ], [æfᵗᵗ bæ], [ufᵗᵗ bɑ]
Target Partial_G: [æχᵗ kɑ], [æfᵗ bæ], [ufᵗ bɑ]
Target Zero_G: [æχ kɑ], [æf bæ], [uf bɑ]
Control: [æχ ke], [æf bæ], [uf bɑ]

The four sets of target and control nonwords presented above are the excised tokens taken from the full words presented below:

Target: /sæχt kɑr/ “hard-working”, /næft bærɑje/ “oil for”, /kuft bɑʃeh/ “be cheap”
Control: /næχ ke/ “thread that”, /sæf bærɑje/ “cue for”, /mæruf bɑʃeh/ “be famous”

3.3 Procedure

All the participants listened to forty stimuli (10 stimuli in each category) with eight repetitions (a total of 320 tokens) in a sound booth located at the linguistics laboratory in Scuola Normale Superiore. The software Presentation was used to present the stimuli to the listeners as an identification task. The participants were asked to listen very carefully and decide as quickly as possible whether it is likely that there has been a [t] at the end of the first part of each stimulus. For each stimulus, the participants were asked to press either the green or the blue button on a Cedrus response pad. On the screen of a computer, listeners could also see “T” or “NO T” corresponding to the response buttons. The stimuli were shuffled and presented in blocks in a way that participants could either begin by hearing all the tokens with [f] or [χ]. They also had the choice of taking a break after listening to every 80 tokens. All the participants received a short training before the start of the experiment. The following section contains the results of the study.

4 Results

The main goal of this study is to test listeners’ sensitivity to different degrees of residual gestures remaining after the simplification of consonant clusters. The response type and reaction time are the dependent variables in this study; however, only the results related to response type are presented here. Figure 1 below shows the perception rate of [t] by all subjects

across the four conditions. According to this, the subjects show the highest rate of [t] perception in tokens with full alveolar gesture (i.e., 59.69%) and the lowest for the ones in the control (i.e., 36.09%). The condition with partial alveolar gestures shows a rate of 56.20%, which is very close to the full condition. The stimuli in the zero alveolar condition show an intermediate level between the control and the other two target conditions, with a rate of 49.84%. This shows almost a similar pattern between the two target conditions with full and partial gestures, an intermediate situation for the target condition with zero gesture, and a pattern for the control which is different from the three target conditions.

[Bar chart: the rate of [t] perception for the mean subject (y-axis from 0% to 80%) in the Control, Full_G, Partial_G, and Zero_G conditions]

Figure 1: The Rate of [t] Perception by all Subjects

In order to examine the relation between the two categorical variables in the study, namely the response type and stimuli condition, a Pearson chi-square test was run. The null hypothesis is that there is no relation between [t] perception and the four conditions in the study. The test with [t] perception as the dependent variable found a significant main effect of condition, χ²(3, N = 960) = 46.2, p < 0.001. This shows that there is a significant relation between the stimuli conditions and response type. In order to determine whether the difference in the perception of [t] across the four categories is really significant or is due to chance variation, a column proportions test was performed. This test uses a z-test to make the comparisons. The result showed that the perception of [t] in the control was significantly different from all target categories. The next section presents the discussion and concluding remarks of the study.

Each W1 was also paired with a semantically unrelated non-antonym target word (W3). Two lists were created each containing 40 sentences with the same format. The target word was an antonym in 20 sentences and a semantically unrelated, non-antonym word in the other 20 sentences. A spacebar press initiated the presentation of the definitional sentence fragments as The opposite of word is; a second spacebar press initiated the presentation of the target word that remained on the screen until response. Participants pressed a YES button to respond to correct targets and a NO button for incorrect targets.

4 Results

Significant group differences emerged in all the neuropsychological tests (see Table 1) administered to patients and controls. The priming scores revealed a statistically significant, enhanced contextual priming in patients compared to controls (16.04% vs. 9.6%). The ANCOVA on response times showed significant main effects of Group, with patients overall slower than controls (Ant.: 1273 ms; Unrel.: 1645 ms; Ant.: 984 ms; Unrel.: 1108 ms, for patients and controls respectively), and of Vocabulary. The ANCOVA on accuracy (Ant.: 96%; Unrel.: 98%; Ant.: 98%; Unrel.: 99%; for patients and controls respectively) showed a main effect of Vocabulary. In addition, the accuracy and response times of patients significantly correlated with Vocabulary scores (WAIS-R) in that patients scoring higher in the Vocabulary test also were overall faster in responding to antonyms and non-antonyms and more accurate in rejecting non-antonyms. Patients scoring higher on the Verbal Scale (WAIS-R) also had faster response times to antonyms, and patients scoring higher on the Positive Scale (PANSS) a lower accuracy on antonyms.

5 Conclusions

While antonym recognition was fast and accurate in healthy controls, the picture emerging for patients is more complex. Specifically, the preceding definitional fragment facilitated antonym recognition in both patients and healthy controls but the amount of facilitation indeed differed. In fact patients were helped more than controls by the previous definitional context, as shown by the larger reduction of response times to antonyms than to non-antonyms (on average, patients were 25.4% faster in responding to antonyms than to non-antonyms compared to 11.8% of controls), and by the exaggerated priming effect of patients (close to twice the effect of controls). This enhanced semantic priming was not associated with the clinical state and/or the thought disorder of patients. In sum, the patients group encoded contextually relevant target words (Titone et al., 2000; Titone et al., 2002) but to a much higher degree than controls. Interestingly, this larger semantic effect occurred under strategically controlled conditions rather than under the automatic condition typical of word priming at short SOAs (Minzenberg et al., 2002). This suggests a compromised ability of patients with SZ to engage in the controlled processing operations necessary to flexibly use semantic memory representations. At the same time the relatively high level of accuracy of patients (96.6% vs. 98.5% of healthy subjects) suggests a preserved semantic storage and access to semantic representations (Titone et al., 2002; Titone et al., 2007). High accuracy may reflect a ceiling effect as well as the fact that polarity information processing can be less demanding on executive resources than other types of semantic relationships (Crutch et al., 2012). Consistently with the reported effects of thought disorder on semantic processing (for overviews, see Kuperberg, 2010b; Pomarol-Clotet et al., 2008), patients with higher scores of positive thought disorder were also less accurate in identifying antonyms. Accuracy instead improved in patients scoring higher in both the Vocabulary sub-test and the Verbal scale of WAIS-R (these patients also had faster response times). These results are consistent with prior studies indicating that in SZ high Vocabulary scores are protective of semantic deterioration (Brébion et al., 2010) reflecting premorbid intelligence (Lezak et al., 2004). On more general grounds, these results provide further evidence of the already documented association of verbal intelligence to efficient language comprehension (Hunt, 1977). Overall, our results indicate that the state of residual SZ contributed to slower antonym recognition above and beyond the cognitive deficits that characterize SZ patients. In sum, it is not the case that patients comprehended antonyms as controls, but simply at a slower pace. In fact, compared to controls, patients not only had longer response times but also enhanced priming scores that presumably reflect deficient controlled semantic processing and overreliance on stored semantic representations. In conclusion, all other things being equal, antonym identification requires a preserved ability to appreciate the difference between maximally similar and maxi-

5 Discussion and Conclusion

This research investigated listeners’ sensitivity to three types of /t/ realizations as target and compared the results with the control. The target categories included simplified consonant clusters with full, partial, and zero alveolar gestures. The stimuli used as the baseline in the control had no alveolar gesture in the underlying form. The general results of the study showed that subjects reliably distinguished the three target sets with varying residual gestures from the control. This could be due to more similarity in tongue configuration in realizing these varying degrees of coronal stop articulation compared to the control condition where there is no alveolar gesture in the underlying form. Any articulatory modification is expected to trigger acoustic changes. The acoustic results of the stimuli used in this study by Falahati (2013) showed no significant difference between the simplified tokens (i.e., the three target sets with varying degrees of residual gestures labeled all together as simplified) and control tokens. The acoustic parameters used in the analysis were V1 duration, consonant cluster duration, and formant transitions. Despite the fact that the results did not show any significant difference between simplified and control conditions, the duration of V1 and consonant clusters in the simplified condition was always higher than in the control condition. It could be the case that these acoustic cues, although not very strong, are enough for the human auditory system to trigger the presence of a segment.

The results of the current study also showed that participants perceived almost 36% of the tokens with no underlying coronal stop as having [t]. This is very similar to the results of the study reported by Nalon (1992) where 20% of the control nonalveolar tokens were perceived as having [d]. In his study, however, the control tokens showed a similar pattern to that of the target with zero alveolar (i.e., full assimilation). He attributes this to both subjects’ natural language experience as well as the inherent ambiguity in the stimuli. He states that subjects are “willing to “undo” its effects” and therefore, in the case of the current study, report coronal stops even where there is no evidence for them.

The results of our study also showed that participants perceived more [t] in the tokens with full and partial alveolar gestures compared to the ones with zero alveolar gestures. The difference

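As an illustrative aside, the Pearson chi-square test of independence used for the [t]-perception data can be sketched in a few lines of Python. The counts in `hypothetical_counts` are invented for illustration: the per-condition counts are not given in the text, and the split of N = 960 into 240 tokens per condition is an assumption. Only the reported statistic of 46.2 with 3 degrees of freedom is taken from the text. The helper names are likewise hypothetical, and the closed-form tail probability holds for 3 degrees of freedom only.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Hypothetical counts (NOT the study's data). Rows: [t] reported, [t] not
# reported. Columns: Control, Full_G, Partial_G, Zero_G, assuming 240 tokens each.
hypothetical_counts = [
    [86, 150, 140, 125],
    [154, 90, 100, 115],
]

statistic, dof = chi_square_independence(hypothetical_counts)
print(f"hypothetical data: chi2({dof}) = {statistic:.1f}")

# Check that the reported statistic (46.2 with 3 df) is consistent with p < 0.001.
print("reported statistic has p < 0.001:", chi_square_sf_df3(46.2) < 0.001)
```

The follow-up column proportions comparison mentioned in the text would need the real per-condition counts, so it is not sketched here.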
between the three categories, however, did not reach the significance level. Such a result could shed more light on the theories of speech perception discussed earlier in this paper. In order to discuss this issue, first we need to further explore the nature of the three categories in the target stimuli. From the three groups in the target stimuli, one group categorically had no alveolar gesture while the other two had different degrees of the gesture either as a result of overlap or reduction. We argue that the gradient gestural reduction and overlap are due to low-level phonetic and mechanical reasons while the categorical deletion, which results in tokens with zero gestures, is caused by the cognitive system. In the former groups, speakers neither intend to reduce nor plan to overlap gestures while the latter process is intended by the speaker.

According to MT and DR, listeners’ target in speech perception is the intended or phonological gestures. Therefore, the overlapped and reduced stimuli should show different perceptual pattern compared to the stimuli with no residual gesture. The results in this study did not show a striking difference between these three target sets. The existence of acoustic cues pertaining to the presence of gestures is a prerequisite to their perception by the listener. If distinguishing acoustic details could be found between these three categories, then this would not support the gesturalist approach to speech perception. However, with the current results, such a claim cannot be made. Further acoustic analysis between these three target sets is needed to examine this idea further.

The findings in our experiment could be best explained by referring to exemplar models of speech perception. In such models, the lexical representations of words change in a gradient way over time. This is due to the nature of some phonological processes in languages which are not categorical. According to this view, the perceptual mechanism is capable of making fine phonetic distinctions. However, it is the mapping between the gradient stimuli and the auditory system which fails and does not result in nonvariant forms.

The lack of such a one-to-one mapping will bring variation across subjects in the speech community. The degree of such variation is determined by the amount of an individual’s exposure to the specific variants. A closer look at the results for individual subjects showed that all twenty-four participants in the study could fall into three or four dominant patterns based on their perception of [t]. The variation across individuals regarding speech perception could be a good source of information for the specialists in the field. Moreover, the degree to which an individual’s speech production could map to his/her perception is an interesting topic which remains to be explored.

Acknowledgments

We are very grateful to Patrice Beddor for her comments and suggestions on this study.

any group difference, given the general cognitive deficits of people with SZ. To limit this potential confound, we carried out analyses of covariance on mean response times and accuracy to partial out the contribution of covariates (i.e., Verbal fluencies, Vocabulary, and Digit Span). Although we did not necessarily expect accuracy to be compromised in patients, given their mild-to-moderate form of SZ, the low-demanding nature of the task and the high familiarity of the stimuli, we expected accuracy to be modulated by the severity of thought disorder and the clinical state of patients, as found in prior studies on semantic processing in SZ.

3 Method

3.1 Participants

Participants included 39 Italian chronic outpatients with paranoid SZ (14 female; mean age 31 years, age range 20-45, SD 6.2) and 39 healthy volunteers as control participants (see Table 1 for a characterization of patients and controls). The diagnosis of paranoid SZ is based on the Positive and Negative Syndrome Scale (PANSS; mean score: 46.69, range: 34-68) and was confirmed by the clinical consensus of staff psychiatrists. Participants gave their informed consent for inclusion before they participated in the study (approved by the Ethics Committee of Modena).

Table 1
Demographic characteristics of the study sample, and clinical characteristics of the schizophrenic patients

                              Patients              Controls
                              Mean      SD          Mean      SD        p
Sex                           M=25; F=14            M=25; F=14
Age (years)                   31.41     6.22        31.28     6.31      .93
Education (years)             12.56     1.33        12.51     1.48      .88
Drug                          SG=33; FG=2; FSG=4
Years of illness              8.97      5.94
WAIS-R (Verbal Scale)         91.05     15.41
WAIS-R (Performance Scale)    86.31     19.42
WAIS-R (Total Score)          87.82     18.31
Vocabulary (WAIS-R)           8.23      3.24        10.77     2.38      .0001
Phonemic Fluency              28.51     8.25        37.28     7.68      .0001
Semantic Fluency              38.44     8.44        44.10     7.74      .003
BADA (errors)                 1.15      1.18        0.03      .16       .0001
Digit SPAN (Forward)          5.44      .74         5.85      .83       .04
Digit SPAN (Backward)         3.75      1.07        4.28      .97       .05
Digit SPAN (Total Score)      9.18      1.51        10.13     1.57      .02
BPRS                          2         0
PANSS (Positive Scale)        11.64     3.12
PANSS (Negative Scale)        11.21     4.02
PANSS (Gen Psyc Scale)        23.84     3.43
PANSS (Total Score)           46.69     8.13

M = male; F = female; FG = first-generation; SG = second-generation antipsychotics; FSG = combination of first- and second-generation antipsychotics.

3.2 Materials and Procedure

Participants were presented with a definitional sentence fragment containing the first word of the antonym pair (e.g., The opposite of black is) followed by the correct antonym (WHITE) or by a semantically unrelated word (NICE). Subjects had to decide whether or not the target was correct. We used 40 very familiar antonym word pairs (W1-W2; e.g., black/white, dead/alive; long/short; optimistic/pessimistic) in which the antonym had a cloze probability value of 0.98.

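For readers who want to sanity-check the group comparisons in Table 1, the following minimal sketch computes a Welch t statistic from reported means and standard deviations. It is a rough plausibility check under stated assumptions, not the authors' procedure: the test that produced the p column is not stated, the function name `welch_t` is hypothetical, and n = 39 per group is taken from the Participants description.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic for two independent groups, from summary statistics."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


N_PER_GROUP = 39  # patients and controls, as stated in the Participants section

# Values copied from Table 1 (patients first, then controls).
t_vocabulary = welch_t(8.23, 3.24, N_PER_GROUP, 10.77, 2.38, N_PER_GROUP)
t_age = welch_t(31.41, 6.22, N_PER_GROUP, 31.28, 6.31, N_PER_GROUP)

print(f"Vocabulary (WAIS-R): t = {t_vocabulary:.2f}")
print(f"Age (years):         t = {t_age:.2f}")
```

A large absolute t for Vocabulary and a near-zero t for Age point in the same direction as the reported p values (.0001 and .93), although exact p values would require the t distribution and the authors' actual test.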
References

Patrice S. Beddor, Kevin B. McGowan, Julie Boland, Andries W. Coetzee, and Anna Brasher. 2013. The perceptual time course of coarticulation. Journal of the Acoustical Society of America, 133, 2350-2366.

Pierre Delattre, Alvin M. Liberman, and Franklin S. Cooper. 1955. Acoustic loci and transitional cues for consonants. Journal of the Acoustical Society of America, 27, 769-773.

Pierre Delattre, Alvin M. Liberman, and Franklin S. Cooper. 1964. Formant transition and loci as acoustic correlates of place of articulation in American fricatives. Stud. Linguist. 18, 104-121.

Randy L. Diehl, Andrew J. Lotto, and Lorri L. Holt. 2004. Speech perception. Annual Review of Psychology, 55, 149-179.

Reza Falahati. 2013. Gradient and Categorical Consonant Cluster Simplification in Persian: An Ultrasound and Acoustic Study. Ph.D. Dissertation, University of Ottawa, Ottawa.

Carol C. Fowler. 1981. Production and perception of coarticulation among stressed and unstressed vowels. Journal of Speech and Hearing Research, 46, 127-139.

Carol C. Fowler. 1984. Segmentation of coarticulated speech in perception. Perception & Psychophysics, 36, 359-368.

Carol C. Fowler. 1996. Listeners do hear sounds, not tongues. Journal of the Acoustical Society of America, 99, 1730-1741.

Keith Johnson. 1997. Speech perception without speaker normalization: an exemplar model. In K. Johnson and J. W. Mullennix (eds.), Talker variability in speech processing, 145-165. San Diego: Academic Press.

Alvin M. Liberman and Ignatius G. Mattingly. 1985. The motor theory of speech perception revised. Cognition, 21: 1-36.

Francis Nalon. 1992. The descriptive role of segments: evidence from assimilation. In G. J. Docherty and R. Ladd (eds.), Papers in Laboratory Phonology IV, 261-289. Cambridge: Cambridge University Press.

John Ohala. 1990. Respiratory activity in speech. In W. J. Hardcastle & A. Marchal (eds.), Speech Production and Speech Modeling, 23-53. Netherlands: Kluwer Academic Publishers.

John Ohala. 1995. The perceptual basis of some sound patterns. In B. Connell and A. Arvaniti (eds.), Phonology and phonetic evidence, Papers in Laboratory Phonology IV, 87-92. Cambridge: Cambridge University Press.

2012). But, according to some authors, patients would fail in inhibiting contextually-irrelevant information, especially at long SOAs (Minzenberg et al., 2002), rather than in encoding contextually-relevant information. This impairment would be linked to a more global deficit in controlled semantic processing (Titone et al., 2000; Titone et al., 2002).

The importance of antonyms for elucidating the organization and retrieval of semantic knowledge is documented by the recent resurgence of interest in antonyms in normal comprehension (e.g., van de Weijer et al., 2014; Paradis et al., 2009). In contrast, the vast literature on semantic processing deficit in SZ has almost ignored antonyms with the exception of a few paper-and-pencil studies of the 1960s (Blumberg and Giller, 1965; Burstein, 1961) that have documented impairment of SZ patients on antonyms. This underestimation of antonyms as a relevant test case of semantic organization can also be attributed to the fact that most neuropsychological studies on conceptual representations have primarily investigated semantic similarity rather than opposition (Crutch et al., 2012) at variance with the fact that semantic opposition, rather than similarity, is thought to be the axis around which the adjectival lexicon clusters (Murphy, 2003; Paradis and Willners, 2011).

2 Aims of the study

Shedding light on whether or not antonym identification is spared in a neurobiological disorder typically associated to semantic deficit may improve our understanding of the organization of word storage in the human brain (Jeon et al., 2009). Our general aim was therefore to expand the knowledge about the cognitive processes underlying the recognition of antonyms, and to evaluate whether these processes differed in SZ and in normal language comprehension. We tested whether the semantic dysfunction that often characterizes people with SZ necessarily leads to a loss of the capacity to recognize antonyms when antonyms are presented alone, rather than with homonyms and/or synonyms (Blumberg and Giller, 1965; Burstein, 1961), and when they are tested with a real-time task (for a more detailed version of this study, see Cacciari et al., 2015).

SZ patients tend to be less accurate and slower than healthy controls on most cognitive measures (Harvey, 2010; Vinogradov et al., 1998). Since response slowing is related to the disease, rather than necessarily reflecting semantic dysfunction (Niznikiewicz et al., 2010), this may lead to an artificial increase of the reaction time difference with healthy participants. To avoid this confound, semantic priming studies have often used a priming score (PRI; Spitzer et al., 1993), rather than the mere response times to the targets. The PRI reflects the amount of facilitation of prior context on the response time to a target and is calculated as follows: $\mathrm{PRI} = \frac{RT_{\text{unrelated targets}} - RT_{\text{related targets}}}{RT_{\text{unrelated targets}}} \times 100$ (Spitzer et al., 1993). Here, we compared the individual PRI of patients to those of pairwise matched healthy controls.

Subjects read a definitional sentence fragment (The opposite of word is..) that, upon pressing the space bar, was followed by the antonym or an unrelated control word. This self-paced target verification task is suited to obtain information on real-time comprehension while placing little demand on the need to maintain and update information in working memory. We did not use similar, fixed time durations for patients and controls because SZ patients typically need longer presentation durations than healthy subjects to perceive a stimulus.

Healthy subjects should respond in a fast and accurate way, in line with the literature. Semantic priming studies often observed an exaggerated priming score of patients compared to controls (for an overview, see Kuperberg, 2010b; Pomarol-Clotet et al., 2008). This, as we mentioned, has been mostly attributed to faster than normal and far-reaching spread of activation in semantic memory. This larger semantic priming effect has been observed under the ‘automatic’ condition of word priming at short SOAs (Minzenberg et al., 2002). In this study, the priming effect elicited by the definitional sentence fragment on the target word, if any, would occur under strategically controlled conditions since the target presentation is self-paced, and the definitional sentence fragment strategically guides the semantic search toward the item that fulfills the antonymy definition. Notwithstanding, if indeed patients are characterized by an abnormal spread of activation, we should obtain larger priming scores in patients than in controls. This result would contribute to clarify the conditions under which hyper-priming effect can occur. The easy nature of the task, the high written frequency and bound lexical couplings of the antonym pairs of this study can minimize semantic processing demands. However, it is unlikely that an even intact ability to identify antonyms may eliminate

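The priming score (PRI) defined in this section can be computed as in the sketch below. The function name `priming_score` is hypothetical, and the example inputs are the group mean response times reported in the results (patients: 1273 ms to antonyms, 1645 ms to unrelated targets; controls: 984 ms and 1108 ms). Because the study computed the PRI per individual, values obtained from group means need not match the reported averages of 25.4% and 11.8%.

```python
def priming_score(rt_unrelated, rt_related):
    """Priming score in percent: facilitation relative to the unrelated RT.

    Implements ((RT_unrelated - RT_related) / RT_unrelated) * 100.
    """
    if rt_unrelated <= 0:
        raise ValueError("rt_unrelated must be a positive response time")
    return (rt_unrelated - rt_related) / rt_unrelated * 100


# Group mean response times in ms, as reported in the results
# (illustrative use only; the study worked with individual scores).
pri_patients = priming_score(rt_unrelated=1645, rt_related=1273)
pri_controls = priming_score(rt_unrelated=1108, rt_related=984)

print(f"patients: {pri_patients:.2f}%")
print(f"controls: {pri_controls:.2f}%")
```

Dividing by the unrelated response time is what makes the score comparable across groups that differ in overall speed, which is the confound the measure is meant to limit.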
Is black always the opposite of white? The comprehension of antonyms in schizophrenia and in healthy participants

Cristina Cacciari, Univ. of Modena, Italy, cacciari.cristina@unimore.it
Francesca Pesciarelli, Univ. of Modena, Italy, francesca.pesciarelli@unimore.it
Tania Gamberoni, AUSL, Modena, Italy, t.gamberoni@ausl.mo.it
Fabio Ferlazzo, Sapienza Univ. Rome, Italy, fabio.ferlazzo@uniroma1.it

Abstract

In this study, we tested the online comprehension of antonyms in 39 Italian patients with paranoid schizophrenia and in an equal number of pairwise-matched healthy controls. Patients were rather accurate in identifying antonyms, but compared to controls, they showed longer response times and higher priming scores, suggesting an exaggerated contextual facilitation. Presumably, this reflects a deficient controlled semantic processing and an overreliance on stored semantic representations.

1 Introduction

In this study we investigated the recognition of antonym word pairs in patients with paranoid schizophrenia and in pairwise matched healthy participants.

Conceptual knowledge stored in semantic memory includes representations of many different types of lexico-semantic relationship, among which antonymy. Antonymy is thought to be the most robust of the lexico-semantic relations, relevant to both the mental organization of the lexicon and the organization of coherent discourse (Fellbaum, 1998; Willners, 2001; Jones, 2002; Murphy, 2003; Paradis and Willners, 2006; van de Weijer et al., 2014). Antonymy is the label generically used to refer to any of two words that are semantically opposed and incompatible for at least one of their senses (e.g., black/white, dead/alive). Antonyms are recognized faster than any other words or non-words in word recognition, elicit each other in word association tests and are often mistaken in speech errors. Antonyms occur very frequently in written and oral language, presumably because binary contrast is a powerful organizing principle in perception and cognition (Bianchi et al., 2011). In sum, antonym word pairs represent an important phenomenon for elucidating the nature of the semantic dysfunction that characterizes schizophrenia (henceforth, SZ) and, on more general grounds, for establishing the neural and cognitive prerequisites of word comprehension. Studying the types of semantic relationship that patients with SZ can or cannot correctly understand may also yield further insights into the ways in which semantic knowledge is represented in the human brain, and into the mechanisms underlying its use.

SZ is a neurobiological disorder associated to several cognitive deficits that include mild to severe language comprehension and production abnormalities (at word and sentence levels) as well as attentional and information processing impairments (Harvey, 2010; Kuperberg, 2010ab; Kiang, 2010; Levy et al., 2010). The literature has shown that language comprehension impairments in SZ are not global and generalized but selectively involve abnormalities at a word and/or sentence level (Kuperberg, 2010ab). Studies on word processing in SZ have predominantly used the semantic priming paradigm obtaining mixed results (for overviews, Minzenberg et al., 2002; Pomarol-Clotet et al., 2008; Pesciarelli et al., 2014). Typically, studies have obtained greater than normal semantic priming (hyper-priming) at short intervals between the presentations of prime and target (SOA, stimulus onset asynchrony) especially, but not only, in thought-disordered patients. Hyper-priming is often accompanied by reduced or absent priming at long SOAs (more than 300 msec). These distorted priming effects have been interpreted in terms of abnormal neural processing of the relationships between concepts in long-term semantic memory and of functional abnormalities of semantic memory neural networks that produce abnormally fast and/or far-reaching spreading of activation among concepts (Kiang, 2010). Patients with SZ would also fail in suppressing or deactivating contextually inappropriate semantic associations because of the distorted use of context that characterizes SZ. This deficit has been attributed to a more general deficit in constructing and maintaining an internal representation of context for control of action (Cohen et al., 1999), due to working memory deficit (Barch and Ceaser,

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

Words matter more than morphemes: Evidence from masked priming with bound-stem stimuli

Hélène Giraudo, Laboratoire CLLE (Equipe ERSS), CNRS & Université Toulouse Jean Jaurès, [email protected]
Madeleine Voga, University Paul-Valéry Montpellier III, [email protected]

Abstract

Five masked priming studies were carried out in order to shed light on the processing of bound-stem words (e.g., terr- in terrible). Both orthographic (e.g., termite) and unrelated (e.g., montagne ‘mountain’) conditions stand as baselines for controlling morphological effects. The results of the experiments using unrelated word controls suggest that in the particular case of bound-stem words, only genuinely derived word primes (terrible) produce positive effects differing from formal overlap effects. Morphological effects are interpreted as resulting from both “morceme” and “base-lexeme” activations.

1 Introduction

As is broadly admitted, morphologically related words prime each other in various languages (Arabic: Boudelaa & Marslen-Wilson, 2001; English: Rastle, Davis, Marslen-Wilson & Tyler, 2000; French: Giraudo & Grainger, 2000; German and Dutch: Drews and Zwitserlood, 1995; Hebrew: Frost, Deutsch & Forster, 1997) thus suggesting the existence of a morphological level of processing. This kind of study has used different types of materials, words or pseudowords, as well as multiple settings: for the masked priming technique (Forster & Forster, 2003), widely used to shed light on morphological processes as well as in this study, the distinction can be made between designs using only unrelated controls and those using both unrelated and orthographic/phonological controls, as suggested by Giraudo & Grainger (2001) or Pastizzo & Feldman (2002).

Even though the existence of a morphological level of processing is unanimously acknowledged, the exact nature, locus and the role of these representations within the mental lexicon is a matter of ongoing research. Two hypotheses can be drawn: according to the first, morphemic units correspond to concrete pieces of words (i.e., stems and affixes). Complex words are therefore processed through a decomposition mechanism that strips off the affix in order to isolate the stem. The morphemic nature of the remaining letters is then verified by the system and access to word representations (i.e., word forms coded in the orthographic lexicon) operates via the pre-activation of their constituent morphemes, i.e., morphemic representations stand as access units. This mechanism is exemplified by Taft’s model (1994), the basic principles of which are followed by many psycholinguistic studies (e.g., Crepaldi, Rastle & Davis, 2010). Morphemic units are situated between the level of letters/syllables and the word level; consequently, they can only be matched to concrete letter clusters (i.e., bound-stems, free-stems and affixes) that constitute words. This decompositional mechanism is also insensitive to any semantic characteristics of words (i.e., transparent vs. opaque morphological formation) or to their lexical environment (in terms of orthographic neighborhood or family size). One of the strong predictions of the decompositional approach is that morphological priming effects should vary following the ease with which constituent morphemes can be identified/extracted.

According to the second hypothesis, morphology is coded at the interface of word and semantic representations and corresponds rather to lexemes (Aronoff, 1994). Lexeme units are coded at the interface of the word and the semantic level, organizing the lexicon in terms of morphological families. The recognition of any complex word triggers first the activation of all word forms that can match with it; a competition is then engaged between the pre-activated forms (forms matching the input, i.e., those who are morphologically


related but also those who are only orthographically related) until the right lexical representation reaches its recognition threshold (determined by its surface frequency). During this competition phase, morphologically related words send positive activation to their respective base lexeme, feeding back activation to them. Morphological priming effects result from this mechanism of co-activation. Following this supralexical approach (Giraudo & Grainger, 2000; 2001), complex words are not “decomposed”, but are able to trigger the activation of their constituent morphemes. In this kind of architecture, lexeme units are supposed to be abstract enough to tolerate variation induced by derivation and inflection (i.e., allomorphy, suppletion, phonological/morphological truncation, haplology, verb-noun conversion). In other words, a morphological unit does not necessarily need to surface in the real world in order to be coded in long-term memory. This organisation, compatible with recent neuroimaging data (Lévy, Hagoort, Démonet, 2014), also implies that all morphemes of a given language are not necessarily represented within the mental lexicon: units such as neologisms, hapaxes and nonce words are not necessarily directly connected with existing morphological units; bound-stem words could be such a case.

2 The study

The present paper focuses on the processing of bound-stem words by opposition to free-stem words. For example, on the one hand, the word viral is composed of the bound-stem vir-, also present in virus, virulent, virulence, virology and virologist and, on the other hand, the word singer is composed of the free-stem sing, which forms singing, song, etc. Both are defined as being morphologically complex but while it is evident for the standard speaker/reader that the complex word singer derives from the root sing, it is less evident to say from which root the complex word viral derives. The morpheme vir-, which does not have any clear meaning in English, can be considered as a bound-stem whereas sing- in singer is a free-stem. From a processing point of view, the vir-viral example can be viewed as a case where the lexical unit is not directly connected to the morphological unit, by virtue of its twofold handicap: the first aspect is semantic interpretability, i.e., derivations composed with a bound-stem could be less interpretable than those with a free-stem. Psycholinguists tested this difference and found that processing for free and bound-stems may differ but both produce significant priming effects (Forster & Azuma, 2000; Järvikivi & Niemi, 2002; Marslen-Wilson, Tyler, Waksler & Older, 1994; Pastizzo & Feldman, 2004). Of great importance to our study, Pastizzo and Feldman (2004) observed that the magnitude of facilitation varied following the baseline used in the experiments: equivalent magnitudes of priming for free and bound-stems were obtained relative to an unrelated baseline; with an orthographic control however, free-stems produced systematically greater priming than bound-stems. The interpretation of this line of research suggests that morphological priming effects are not directly constrained by semantic similarity between prime and target. The second handicap, in terms of surface analysis, consists in the difficulty in segmenting the word forms into morphemes. At this point, the two different models presented above give rise to different predictions: according to the morpheme-based approach all complex forms (free-stem as well as bound-stem words) are first analyzed in morpheme fragments and then access word representation, in other words, the lexicality of the base doesn’t matter. This approach predicts morphological priming between derivations (e.g., virus-viral) as well as between the base and its derivation (e.g., vir-viral).

According to the supralexical approach, the members of a morphological family are linked together by virtue of their common base at the lexeme level; however, the base of bound-stem words is not represented at the word level. In this case, priming effects between related derived words (e.g., virus-viral) are expected but no effect should be observed using their bound-stems as primes, the access to the base lexeme being conditioned by the prior activation of a word form at the word level.

Taft and Kougious (2004) investigated this issue in English through a masked priming experiment. They compared both semantically and orthographically related words (e.g., virus-viral) to merely orthographically related words (e.g., future-futile) and, unsurprisingly, found facilitation in the former case but not in the latter. Nevertheless, the design of this study is not very informative with respect to the decomposition issue, given that the critical condition examining the effect of the bound-stem on its derivations has not been considered.

References

Leslie A. Baxter. 1992. Root metaphors in accounts of developing romantic relationships. Journal of Social and Personal Relationships, 9: 253-275.

Mark Davies. 2013. Corpus of Global Web-Based English: 1.9 billion Words from Speakers in 20 Countries. Available online at http://corpus.byu.edu/glowbe/.

Javier E. Díaz-Vera and Rosario Caballero. 2013. Exploring the feeling-emotions continuum across cultures: Jealousy in English and in Spanish. Intercultural Pragmatics, 10(2): 265-294.

Raymond W. Gibbs. 2006. Cognitive linguistics and metaphor research: Past successes, skeptical questions, future challenges. DELTA 22: 1-20.

Braj B. Kachru. 1988. The spread of English and sacred linguistic cows. In Peter H. Lowenberg (ed.), Language Spread and Language Policy: Issues, Implications and Case Studies, 207-228. Washington, D. C.: Georgetown University Press.

Zoltán Kövecses. 1998. The Language of Love: The Semantics of Passion in Conversational English. Toronto: Bucknell University Press.

Zoltán Kövecses. 2000. Metaphor and Emotion. Language, Culture, and Body in Human Feeling. Cambridge: Cambridge University Press.

Niklas Luhmann. 1996. Liebe als Passion. Zur Codierung von Intimität. Frankfurt: Suhrkamp.

Carolyn Miller and Dawn Shepherd. 2009. Questions for genre theory from the blogosphere. In Janet Giltrow and Dieter Stein (eds.), Genres in the Internet. Issues in the Theory of Genre, 263-290. Amsterdam: John Benjamins.

Michael Novak. 2013. The Myth of Romantic Love and other Essays. Piscataway: Transaction.

Naomi Quinn. 1987. Convergent evidence for a cultural model of American marriage. In Dorothy Holland and Naomi Quinn (eds.), Cultural Models in Language and Thought, 173-192. Stanford: Stanford University Press.

Ulrike A. Schröder. 2009. Preferential metaphorical conceptualizations in everyday discourse about love in the Brazilian and German speech communities. Metaphor and Symbol 24: 105-120.

Anatol Stefanowitsch. 2004. HAPPINESS in English and German: A metaphorical-pattern analysis. In Michel Achard and Suzanne Kemmer (eds.), Language, Culture, and Mind, 137-149. Stanford: CSLI.

Anatol Stefanowitsch. 2006. Words and their metaphors: A corpus-based approach. In Anatol Stefanowitsch and Stephan Th. Gries (eds.), Corpus-based Approaches to Metaphor and Metonymy, 63-105. Berlin and New York: Mouton de Gruyter.


Our study aims to fill this gap through five visual masked priming experiments with native French speakers. In this kind of protocol, subjects are unaware of the presence of the prime, which allows minimizing strategy use and examining automatic processing during the early stages of word identification: all five experiments use a within-priming (Latin square) design, in which we directly compare the effects of different primes on the same target. A 57ms prime duration was used and the task was lexical decision.

Exp. 1 examined morphological effects induced by words sharing the same bound-stem, e.g., terrible – terreur ‘terrible-terror’ relative to an orthographic control baseline, e.g., termite – terreur ‘termite-terror’ (where ‘termite’ is a monomorphemic word), as well as an unrelated baseline (montagne – terreur ‘mountain-terror’). Results show that only truly derived word primes produce facilitation, relative to unrelated (36ms of facilitation) as well as orthographic controls (35ms). However, this first result does not inform us about how derived words constructed with a bound-stem are processed: are they analyzed in terms of stem + affix or are they globally processed? Exp. 2 examined the extent to which the facilitation we take as morphological could be due to formal overlap: this is done by using non-existent orthographic controls, sharing all but one letter with the ‘true’ bound-stem, e.g., for the target terreur, the first possible prime is the true bound-stem terr- presented in isolation (e.g., terr – terreur); the second priming condition is the non-existing bound-stem condition (orthographic control) tarr- (e.g., tarr – terreur);

word primes (e.g., terrible) produced significant priming effects (33ms), though these conditions did not significantly differ from the bound-stem condition (e.g., terr-) whose effect (18ms) did not manage to reach significance.

Exp. 4 was designed to see if the advantage for the complex word sharing the same bound-stem found in exp. 3 holds up to the comparison with non-word primes constructed with the same bound-stem and an existing suffix. The three priming conditions were the following: the morphologically related word sharing its bound-stem with the target, e.g., terrible – terreur; a non-word made of the same bound-stem and a suffix different to that of the target, e.g., terrage – terreur (where -age corresponds to an existing morpheme); an unrelated control, e.g., montagne – terreur. The statistical analysis of the results revealed that only related word primes (e.g., terrible) produced significant morphological priming (40ms) relative to the unrelated controls. Even if the non-word prime condition (e.g., terrage) led to quicker reaction times compared to the unrelated baseline (688 vs 703ms), it didn’t differ significantly from it. More importantly, the 25ms difference between the word prime condition and the non-word one is statistically significant. This suggests that it takes a real word to induce morphological priming, independently and above orthographic low-level perceptual influences, to which the masked priming technique is known to be sensitive. Our results show that the presence of an existing bound-stem in a non-word does not suffice to produce morphological priming, a finding which contradicts those pub-

SOURCE             GB     IN     PK     NG
FLUID/CONTAINER    32      8     45     11
INSANITY           25     18     21      8
NATURAL FORCE      15     15     14     16
OPPONENT           14     12      4      3
WAR                14     10      4      4
FIRE/LIGHT         10     12     20     12
NUTRIENT            7     18      8     15
HIGH/RAPTURE        7      9      4      6
HEALING             3      2      3      1
SPORT/GAME          2      6      2      8
BOND                -      8      5      5
DEITY               -     19      3     22
ART/SKILL           -      5      3      6
CAPTIVE ANIMAL      -      8      9      8
WARMTH              -      -      -      4
MAGIC               -      -      -      5
AIR                 -      -      -      1
TOTAL             129    159    146    142

Table 5: Distribution of force-related source domains in four corpus sections.

According to the data described in Table 3 and in Table 5, the GB section yields the lowest number of instances in which love is portrayed as a force (129 instances in all). The largest number of examples in this corpus section portray love either as a SUBSTANCE INSIDE THE EXPERIENCER (32 instances) or as INSANITY (25 instances) and, hence, are compatible with views of other emotions (such as anger or happiness; Kövecses 2000). The other three sections yield not only a higher frequency rate of force-related metaphors (IN: 259; PK: 146; NG: 142), but also a more

chines) or to interactive cooperation (as in, for example, economic exchange, hidden object or journey). The overall distribution of the 7 conceptual mappings included within this category in each corpus section (Table 3 above) indicates that relationship-related source domains motivate a relatively low number of metaphorical expressions in the four sections. This is especially true in the case of the PK (27.7%) and the NG (24.0%) sections, both of which yield a high number of examples of force-related mappings.

SOURCE             GB     IN     PK     NG
VALUABLE OBJECT    43     36     19     11
LIVING ORGANISM    25      9      9      8
HIDDEN OBJECT      24     26     13     27
ECON. EXCHANGE     20     36     34     32
UNION OF PARTS      9      8      4      5
JOURNEY             6     11      9      9
BUILDING            3      7      5      6
TOTAL             130    133     93     98

Table 6: Distribution of relationship-related source domains in four corpus sections.

5 Conclusion

The findings of my research on love expressions in a variety of world Englishes show that there exist important differences in the conceptualization of love, from the more passional force-related expressions to the more rational relationship-related ones. Based on this distinction, I have analyzed the distribution of each set of metaphors in four GloWbE sections.
Whereas the third condition is an unrelated baseline (e.g., lished by Longtin and Meunier (2005) as we varied articulation in terms of source domains overseas Englishes show a preference for force- montag – terreur). Although only true bound- shall see in the discussion. Experiment 5 exam- within this category. In fact, many of the expres- based mappings, GB English is relatively neutral stems induced significant facilitation relative to ined the extent to which the morphological facili- sions analysed here instantiate the metaphors (as in the general LOVE IS A STATE metaphor). the unrelated baseline (28ms), the non-existing tation found in exp. 4 could be due to formal fac- LOVE IS A DEITY, LOVE IS WARMTH and LOVE IS Further, whereas the idea of romantic love (em- stem condition (e.g., tarr-) exhibited reaction tors: in order to test this, we replaced the mor- MAGIC, all of which are completely absent from phasis on the collaborative relationship between times (RTs) that did not differ significantly from phologically related word primes by non-words the part of the GB section analysed here. two partners, typically Western love ideal; No- those of the true bound-stem condition. This re- constructed with a bound-stem and a final letter

vak 2013) is more frequent in the GB section, the sult highlights the fact, already pointed out by sequence that does not correspond to any suffix 4.3 Relationship-related metaphorical pat- other corpus sections show a greater tendency to Forster (1999), that there is an influence of for- in French. The following three prime conditions terns talk about love as an emotion, accentuating the mal factors in this kind of protocol, as well as the defined the three levels of the prime type factor: moment rather than the future. need to include orthographic controls in the de- a complex non-word formed by a bound-stem This category includes those metaphorical ex- sign. Experiment 3 directly compared the effects and a suffix, e.g., terrage – terreur (where terr- pressions where love is portrayed by speakers as of complex word primes to those of bound-stem and –age correspond to existing morphemes); a a romantic relationship between two individuals, Acknowledgments primes: the targets were the same as in Exp. 1 simplex non-word formed by a bound-stem and a who cooperate with each other in order to reach a and the three levels of the prime type factor were non-existing ending, e.g., terryme – terreur, in This research has been supported by the Re- common goal. These metaphors are frequently the following: a morphologically related suffixed which –yme is not a suffix; finally, an unrelated found in reference to other types of human rela- gional Government of Castilla-La Mancha (“The word sharing the same bound-stem, e.g., terrible non-word, e.g., moitagne – terreur. 
The statisti- Expression of basic Emotions in English: Dia- tionship (such as friendship), and are normally – terreur ‘terrible-terror’; its bound-stem, e.g., cal analysis of the results revealed that both chronic and Sociolinguistic Variation” PPII- related either to the handling of complex physi- terr - terreur; an unrelated control, e.g., monta- complex and simplex non-word primes produced cal objects (such as plants, buildings or ma- 2014-015-A). gne – terreur. Results showed that only complex shorter RTs than unrelated primes (31 and 27ms


of effect respectively): both types of prime are able to facilitate target recognition and thus produce morphological-like facilitation. Nevertheless, the fact that the effects produced by complex primes (e.g., terrage) did not differ from those produced by simplex non-word primes (e.g., terryme) leads us to reject any interpretation based on pre-lexical morphological decomposition. We suggest interpreting this pattern of results on the basis of formal criteria: for real words it takes a real word to facilitate processing (Exp. 4), but for non-words, given the absence of representation at the word level, morphological-like priming does nothing but reflect low-level perceptual similarities, such as between the two non-words (both complex and simplex, terrage/terryme) and the target terreur.

Besides the role attributed to formal factors, the point that should be stressed in the interpretation of Exp. 4 and 5 is that while in Exp. 4 the non-word made up from an existing bound-stem and an existing suffix (terrage) seems to interfere with processing of the target (terreur) by virtue of its morphological structure, in Exp. 5 this interference disappears. The fact that, in the ‘terrage’ condition (Exp. 4), we observe RTs that are not significantly quicker than the unrelated condition, despite the existence of a formal overlap combined with morphological-like structure (terr-age/terr-eur), can only be due to some kind of interference; otherwise we should observe at least a small formal effect. This interference nevertheless disappears in Exp. 5, since both types of non-words (with existing suffix, e.g., terrage, as well as non-existing suffix, i.e., simplex non-words such as terryme) lead to significant facilitation. We therefore obtain a different pattern of priming for words (Exp. 4) and for non-words (Exp. 5), which leads us towards an approach where lexicality of the prime does matter in the overall pattern of results. Even if the processing system can take advantage of orthographic similarities between prime and target (and will not prevent itself from doing so, as Exp. 2 showed), this does not tell the whole story, and it certainly does not tell a morphological story: it is just another demonstration of a fact that researchers working with masked priming are familiar with, namely that this technique is sensitive to formal factors (Forster, Mohan & Hector, 2003). The experiments presented here provide evidence that we can use this valuable technique in order to shed light on truly morphological effects, as opposed to morphological-like effects.

Taken together, the results of the experiments using unrelated word controls (Exp. 1, 3 and 4) suggest that in the particular case of bound-stem words, only genuinely derived word primes (terrible) produce positive effects differing from formal overlap effects. This is true with the exception of Exp. 3, where the effect of genuinely derived word primes did not differ from bound-stem primes (terr-); note however that in this experiment, the bound-stem condition did not differ from the unrelated condition, while the derived word condition did. This is a demonstration of the fact that “nonwords would be always better form-primes than words, even when masked. The reason is simply because a related word prime will compete more vigorously with the target than a related nonword prime” (Forster, 1999: 8). These results are not in accordance with those found by Longtin and Meunier (2005) using roughly the same priming conditions. In their study, derived non-word primes (e.g., garagité) systematically produced significant priming effects on target recognition relative to unrelated word controls, while non-morphological non-word primes (e.g., rapiduit) yielded a 29 ms non-significant effect. Two factors can explain these contradictory results: a) the type of unrelated controls: contrary to Longtin and Meunier, we examined priming effects relative to unrelated non-word primes when the prime conditions included non-words and word primes when the prime conditions included words; b) the type of word targets: given that our study focuses on bound-stem words, our targets are mandatorily complex words, and not bare-bases, as in the Longtin & Meunier study. Bare-bases are by definition more frequent and, consequently, easier to activate because of their lower activation threshold (due to their residual activation; for a discussion on this point based on McClelland & Rumelhart 1981, see Voga & Giraudo, 2009; Giraudo & Voga, 2014).

3 Discussion: On the representation of bound-stem words

On the basis of the above results, we can conclude that recognition of complex words benefits from two sources of facilitation: a bottom-up excitation from a sublexical level and a top-down facilitation from a supralexical level. The idea of a double representation for morphology was recently expressed by Diependaele, Sandra & Grainger (2005), suggesting that the morphological level should be situated both above and below the word-form level. Accordingly, morphological representations would be defined either as morphologically constrained orthographic representations (depending on frequencies) or as morphologically constrained semantic representations (coded in terms of regularities in the mapping of word forms onto semantics). Along the same lines, Crepaldi et al. (2010) proposed an extension of Taft’s (1994) sublexical model integrating a lemma level situated between an orthographic lexicon and the semantic system. However, these two models consider the two morphological levels equivalent, given that they both contain units corresponding to concrete morphemes. One may nevertheless assume that different locations imply different contents: the hybrid model we propose (Giraudo & Voga, 2014) is based exactly on this assumption. Within this model, morphologically complex words are coded according to two dimensions, their surface form and their internal structure. The first level captures the statistical regularities of morphemes translated in terms of perceptual saliency in the language. At this level, morphologically complex and pseudo-derived words, as well as non-words whose surface structure can be divided into distinct morphemes, are equally processed. This level is not a morphological level but rather a sub-orthographic level containing “morcemes”. The second level, i.e., the morphological level, is paradigmatically oriented: it deals with the construction of words according to morphological rules (Booij, 2005; Corbin, 1987/1991); it contains “base-lexemes”, units abstract enough to tolerate orthographic and phonological variations produced by derivation and inflection processes and connected to their related word forms on the basis of semantic transparency.

References

Aronoff, M. (1994). Morphology by itself. Cambridge: MIT Press.
Boudelaa, S., & Marslen-Wilson, W. D. 2001. Morphological units in the Arabic mental lexicon. Cognition, 81:65-92.
Booij, Geert. 2005. Compounding and derivation: evidence for Construction Morphology. In W.U. Dressler, D. Kastovsky and F. Rainer (eds) Demarcation in Morphology. Amsterdam / Philadelphia: John Benjamins. 109-132.
Corbin, D. (1987/1991). Morphologie dérivationnelle et structuration du lexique, vol. 2. Tübingen/Villeneuve d’Ascq: Max Niemeyer Verlag / Presses Universitaires de Lille.
Crepaldi, D., Rastle, K., Davis, C. 2010. Morphemes in their place: Evidence for position-specific identification of suffixes. Memory & Cognition, 38:312-321.
Drews, E. & Zwitserlood, P. (1995). Morphological and orthographic similarity in visual word recognition. Journal of Experimental Psychology: Human Perception & Performance, 21:1098-1116.
Forster, K. I. (1999). The microgenesis of priming effects in lexical access. Brain and Language, 68:5–15.
Forster, K.I. and Azuma, T. 2000. Masked priming for prefixed words with bound-stems: Does submit prime permit? Language and Cognitive Processes, 15(4-5):539-561.
Forster, K.I., & Forster, J.C. (2003). DMDX: A Windows display program with millisecond accuracy. Behavior Research Methods, Instruments, & Computers, 35:116-124.
Frost, R., Deutsch, A., & Forster, K.I. 2000. Decomposing morphologically complex words in a nonlinear morphology. Journal of Experimental Psychology: Learning, Memory and Cognition, 26:751-765.
Giraudo, H. & Grainger, J. 2000. Prime word frequency in masked morphological and orthographic priming. Language and Cognitive Processes, 15:421-444.
Giraudo, H., & Grainger, J. 2001. Priming complex words: Evidence for supralexical representation of morphology. Psychonomic Bulletin & Review, 8(1):127-131.
Giraudo, H. & Voga, M. (2014). Measuring Morphology: the tip of the iceberg? A retrospective on 10 years of morphological processing. Carnets de Grammaire, 22:136-167.
Järvikivi, J., & Niemi, J. 2002. Form-based representation in the mental lexicon: Priming (with) bound stem allomorphs in Finnish. Brain & Language, 81:412-423.
Lévy, J., Hagoort, P., & Démonet, J.F. 2014. A neuronal gamma oscillatory signature during morphological unification in the left occipitotemporal junction. Human Brain Mapping, 35(12):5847-60.
Longtin, M.-C. & Meunier, F. 2005. Morphological decomposition in early visual word processing. Journal of Memory and Language, 53(1):26-41.
McClelland, J. L., D.E. Rumelhart. 1981. An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88:375-407.
Marslen-Wilson, W. D., Tyler, L. K., Waksler, R., & Older, L. 1994. Morphology and meaning in the English mental lexicon. Psychological Review, 101:3-33.
Pastizzo, M. J. and Feldman, L. B. 2004. Morphological processing: A comparison between free and bound-stem facilitation. Brain & Language, 90:31-39.
Rastle, K., Davis, M. H., Marslen-Wilson, W. D., & Tyler, L. K. 2000. Morphological and semantic effects in visual word recognition: A time-course study. Language and Cognitive Processes, 15:507-537.
Taft, M. 1994. Interactive activation as a framework for understanding morphological processing. Language & Cognitive Processes, 9:271-294.
Taft, M., & Kougious, P. 2004. The processing of morpheme-like units in monomorphemic words. Brain & Language, 90:9-16.
Voga, M. & H. Giraudo. 2009. Pseudo-family size influences processing of French inflections: evidence in favor of a supralexical account. In F. Montermini, G. Boyé, J. Tseng (eds), Selected Proceedings of the 6th Décembrettes: Morphology in Bordeaux. Somerville, MA: Cascadilla Proceedings Project, 148-155.

love. I am especially interested in determining whether, and to what extent, these extra-linguistic factors can account for the conceptual differences illustrated in my quantitative analysis of love expressions.

In order to identify the metaphors for love used in the corpus, I have adopted the metaphorical pattern analysis (MPA) as proposed by Stefanowitsch (2004, 2006). This method, which takes the target domains of the figurative expressions as the starting-point of the analysis, consists in choosing one or more lexical items referring to the target domain under scrutiny and extracting a representative sample of their occurrences in the corpus. To start with, I have located all the instances of the noun love in the four corpus sections (GB, IN, PK and NG). As can be seen in Table 1, the absolute and relative distributions of this noun are highly irregular. For example, whereas only the GB section of the corpus scores a per-million-word frequency (PER MIL) for this noun below the general GloWbE corpus average (217.98), the IN and the NG sections show much higher frequency rates.

SECTION      FREQ   PER MIL
GB          69392    179.02
IN          26355    273.30
PK          13114    255.30
NG          12179    285.58
GloWbE     410815    217.98

Table 1: Absolute and relative frequencies of the noun ‘love’ in four corpus sections.

In order to be able to compare the four corpus sections with each other, I have selected and analyzed only a random sample of 1,000 love expressions in each sub-corpus (4,000 expressions in all). After collecting 1,000 instances incorporating the key term love in each corpus section, I extracted the expressions where the emotion was discussed in metaphoric terms, and sorted them according to the general source domains motivating the figurative expression (e.g., NUTRIENT, JOURNEY, UNITY OF PARTS, FIRE, etc.). These were then further tagged paying attention to the more specific source and target domains involved in the metaphors (e.g., LOVE IS MADNESS within the more general metaphor LOVE IS INSANITY scenario). Thereafter, the resulting conceptual metaphors were further classified into three broad classes on the basis of their source-domain orientation (Kövecses, 2000: 110):

• Space-related source domains: The first category includes very general spatial metaphors, such as LOVE IS A BOUNDED REGION and LOVE IS A CONTAINER.
• Force-related source domains: The second category includes most of the source domains typically used in the conceptualization of emotions in English, such as EMOTION IS A NATURAL FORCE, EMOTION IS INSANITY or EMOTION IS FIRE.
• Relationship-related source domains: The third category includes a set of specific source domains for human relationships in English, such as HUMAN RELATIONSHIP IS A PLANT, HUMAN RELATIONSHIP IS A JOURNEY or HUMAN RELATIONSHIP IS ECONOMIC EXCHANGE.

Based on the above classification of specific source domains, I will assume here that speakers from different parts of the English-speaking world construe love via conceptual metaphor in different ways. Through the quantitative and qualitative analysis of the set of figurative love expressions collected in the GloWbE corpus, it is possible to determine the speakers’ relative preferences to talk about love as a state, as an emotion or as a relationship. Through the comparative analysis of the figurative expressions used in the four corpus sections under scrutiny, I will try to illustrate how these conceptual preferences might be embedded in different cultural backgrounds. The results from each corpus section are discussed in turn in the following sections.

4 Findings and discussion

As indicated above, the data used for this analysis has been collected using the GloWbE. The texts included in this corpus illustrate the genre ‘personal blog’; furthermore, as indicated above, these texts were compiled during a relatively short period of time (December 2012). Consequently, they are highly homogeneous not only in terms of their genre, but also in terms of their date of production.

As described above, in the first stage of this research I have located all the instances of the noun love in four corpus sections (GB, IN, PK and NG). Thereafter, I have classified these expressions into two large groups: literal and figurative expressions. According to this part of my analysis (see Table 2), the four corpus sections analyzed here show relatively similar rates of literal and non-literal love expressions. Whereas the highest number of figurative expressions is found in the GB section (43.6%), the lowest number of metaphors corresponds to the PK section (34.7%).

SECTION   LITERAL   FIGURATIVE
GB            564          436
IN            568          432
PK            653          347
NG            596          404
TOTAL       2,381        1,619

Table 2: Distribution of literal and figurative ‘love’ expressions in four corpus sections.

However, as can be seen in Table 3, major differences arise if we compare the relative frequencies of the three broad categories of source domain described above (i.e. space, force and relationship). In spite of the very similar total number of instances of each category, the geographical distribution of these occurrences clearly points towards a preference for force-related source domains in the PK (42.0%) and in the IN (37.0%) sections, in clear contrast with the neat preference for space-related source domains in GB and NG (41.0%).

SECTION   SPACE   FORCE   RELATION
GB          177     129        130
IN          140     159        133
PK          108     146         93
NG          164     142         98
TOTAL       589     576        454

Table 3: Distribution of space-, force- and relationship-related source domains in four corpus sections.

Furthermore, according to the data presented above, whereas relationship-related source domains occupy a secondary position in the four corpus sections, their relative frequency is especially low in the PK (27.0%) and in the NG (24.0%) sections.

4.1 Space-related metaphorical patterns

Space-related metaphorical patterns represent the most general and neutral option as regards the expression of states and emotions. According to these EVENT STRUCTURE metaphors, states in general are conceptualized as physical locations or bounded regions in space. Speakers use sentences such as ‘I am in love’ to indicate, in a very neutral way, their emotional state. The adverb deeply is frequently used in these examples in order to indicate intensity of the emotion. The notion of change is viewed as motion into (as in ‘I am falling in love’) or out of (as in ‘I am falling out of love’) this emotional state, conceptualized as a container. Within this group, I have found several expressions where love is conceptualized as a nest, and lovers are birds in the nest.

According to the GB data, there is a strong preference among British speakers to use the noun love in expressions conveying the metaphors LOVE IS A BOUNDED REGION (83 instances) and LOVE IS A CONTAINER (94 occurrences). The relative frequency of these metaphors is much lower in the other three corpus sections. As can be seen in Table 4, only in the NG section do we find a similar relative frequency of the metaphor LOVE IS A CONTAINER.

SECTION   REGION   CONTAINER   TOTAL
GB            83          94     177
IN            65          75     140
PK            42          66     108
NG            68          96     164
TOTAL        258         331     589

Table 4: Distribution of space-related source domains in four corpus sections.

4.2 Force-related metaphorical patterns

Force-related metaphors are frequently used by English speakers in order to express their emotions. According to this view, love can be conceptualized as a NATURAL/PHYSICAL FORCE, as an OPPONENT IN A STRUGGLE, or as FIRE/LIGHT, among others. Broadly speaking, these conceptual mappings indicate that the person in love is passively affected by a force (either external or, less frequently, internal), which produces either resistance or loss of control (or both). Preference for these metaphorical expressions points towards a stronger presence of the passionate ideal of love that characterizes the earliest stages of the relationship (Luhmann 1996; Schröder 2009: 105).

Within this group, I have analyzed the distribution of 17 love metaphors in the four corpus sections. The results of this part of the analysis can be seen in Table 5.
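The section-level percentages quoted in the discussion of Table 3 can be recomputed from the raw counts. The following is a minimal sketch, assuming that each percentage is the category count divided by the number of figurative expressions in that section (the row total of Table 3); the names `TABLE_3`, `category_shares` and `SHARES` are illustrative and not part of the original study.

```python
# Counts copied from Table 3 (space-, force- and relationship-related
# source domains per corpus section).
TABLE_3 = {
    "GB": {"space": 177, "force": 129, "relation": 130},
    "IN": {"space": 140, "force": 159, "relation": 133},
    "PK": {"space": 108, "force": 146, "relation": 93},
    "NG": {"space": 164, "force": 142, "relation": 98},
}


def category_shares(counts):
    """Return each category's share (in %) of the section's figurative total.

    Assumption: the denominator is the sum of the three category counts,
    which matches the FIGURATIVE column of Table 2 for every section.
    """
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Shares for all four sections, keyed by section code.
SHARES = {section: category_shares(counts) for section, counts in TABLE_3.items()}

if __name__ == "__main__":
    for section, shares in SHARES.items():
        preferred = max(shares, key=shares.get)
        print(f"{section}: {shares} (most frequent: {preferred})")
```

Under this assumption the shares come out at 42.1% (PK, force), 36.8% (IN, force), 40.6% (GB and NG, space), 26.8% (PK, relationship) and 24.3% (NG, relationship), each within half a percentage point of the figures given in the running text, and the preferred category per section matches the pattern described there.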


Phonotactic probabilities in Italian simplex and complex words: a Love in the time of the corpora. fragment priming study Preferential conceptualizations of love in world Englishes

Javier E. Díaz-Vera

Universidad de Castilla-La Mancha Giulia Bracco Basilio Calderone Chiara Celata Departamento de Filología Moderna Università di Salerno CNRS & Université de Toulouse II Scuola Normale Superiore 13071 Ciudad Real, Spain Via Giovanni Paolo II 132 5 allées Antonio Machado P.zza dei Cavalieri 7 Fisciano (SA) Toulouse Pisa [email protected]

[email protected] basilio.calderone [email protected] Davies, 2013), I will demonstrate here that the @univ.tlse2.fr 1 Introduction varieties of world English under scrutiny show significant differences in the conventional use of According to Gibbs (2006) “there is still insuffi- figurative expressions. Thereafter, these findings cient attention paid to the exact ways that cul- will be related to the cultural background of each tural beliefs shape both people’s understandings speech community. results of the study on simplex words only; we of their embodied experiences and the conceptual 1 Introduction however discuss the implications of the current metaphors which arise from these experiences.” 2 Research questions findings for the processing of complex words. For example, the conceptual metaphor EMOTIONS Phonotactics refers to the sequential organization ARE FLUIDS WITHIN THE BODY seems to underlie Through the fine-grained analysis of the data of phonological units that are legal in a language 2 Experiment a wide variety of metaphorical expressions used described below, in this paper I will address the (Crystal 1992). However, legal sound sequences by speakers from different linguistic and cultural following research questions: (a) How do speak- do not all occur with the same probability in a 2.1 Materials and procedure areas all around the world. The geographical dis- ers from different parts of the English-speaking language . Phonotactic probability is most often Forty-two native Italian speakers participated in tribution of these metaphorical expressions is so world conceptualize love? (b) What do these measured in terms of transitional probabilities a speeded lexical decision task in a fragment general that numerous researchers have pro- conceptual preferences tell us about these Eng- (TPs) of biphones and has been shown to influ- priming paradigm. 
ence a large range of processes, including infants’ discrimination of native language sounds, adults’ ratings of the wordlikeness of nonwords (Vitevitch et al. 1997), speech segmentation (Pitt & McQueen 1998, Mattys & Jusczyk 2001), word acquisition (Storkel 2001) and recognition (Luce & Large 2001). Specifically, in the domain of word recognition, high TPs facilitate word and nonword identification in speeded same-different matching tasks, but slow down identification in lexical decision tasks due to the inhibitory effects of a large neighborhood (e.g. Vitevitch & Luce 1999, Luce & Large 2001). Most of the studies on the role of TPs in speech production and perception have been conducted on English.

In this paper we focus on the role of phonotactic probabilities in priming morphologically simplex and complex words in Italian. We investigate whether biphone TPs affect the recognition of word targets after exposure to fragment primes differing in the probability with which the fragment-final consonant predicts the consecutive segment in the target.

We opted for a non-factorial, regression design including lexical and sub-lexical frequency and distributional variables as predictors (see Baayen 2010). In this paper, we report on the

Thirty bi- or tri-syllabic Italian nouns containing a biphonemic consonant cluster in internal position (e.g. borsa, ‘bag’) served as targets. Each target was primed by a sequence corresponding to an initial fragment of the target (e.g. bor-borsa). The fragment prime could consist of 3 or 4 phonemes and always ended with the first consonant of the cluster. The average length ratio between prime and target was 0.49. The clusters were different across words and each cluster could occur in only one target (although more than one fragment could end in a given consonant). 12 were heterosyllabic (e.g. bor-sa ‘bag’), 12 tautosyllabic (e.g. deg-rado ‘decay’) and 6 ambisyllabic clusters (e.g. dis-tanza ‘distance’).

Another set of 30 Italian nouns, matched for average length, frequency and prime/target length ratio, was included, in which the fragment prime ended in a syllable onset consonant followed by a vowel (e.g. tuc-tucano ‘toucan’). The same proportion of fragment-final consonants was maintained in the two sets of words.

Sixty pseudowords matching for average length and properties of the fragment were added. Pseudowords were obtained by changing one letter of existing words (belonging to the same frequency range of the experimental words), for

claimed their universal character, in so far as they are based on our common, embodied experience (Kövecses, 2000). However, the apparent ubiquity of this metaphorical mapping in contemporary emotional expressions does not necessarily imply that speakers from different linguistic or dialectal areas understand (or, of course, experience) emotions in the same identical way (Díaz-Vera and Caballero, 2013).

In this paper, I deal with the analysis of conceptual variation in the metaphorical construction of love in a group of dialectal varieties of contemporary English. Unlike earlier studies of love metaphors in English (Quinn 1987; Baxter, 1992; Kövecses, 1998), my main aim here is to analyze the socio-cultural dynamics of conceptual metaphor through the reconstruction of the preferential conceptualizations of love by speakers of a series of dialectal varieties of the same language, as spoken in culturally diverse regions. Through the analysis of the socio-cultural dynamics of conceptual metaphor, I intend to contribute to the field of Cognitive Dialectology by addressing the question whether cultural and conceptual differences can be detected language-internally, not just across languages.

Based on textual data extracted from the Corpus of Global Web-Based English (GloWbE;

lish varieties from a sociolinguistic perspective? (c) To what extent can social and cultural factors account for these processes of conceptual variation?

3 Methodology

As indicated above, the data used for this analysis has been collected using the GloWbE, which contains 1.9 billion words. This corpus is illustrative of the different ways English is used by speakers living in 20 different countries. The texts included in this corpus represent the genre ‘personal blog’ (Miller and Shepherd, 2009); these texts come from 1.8 million web-pages compiled in December 2012 using a highly automated production process.

The present study is limited to the analysis of data extracted from four different national sections within the GloWbE, illustrating two very different sociolinguistic contexts: the inner circle (i.e. countries where English is the primary language) and the outer circle (i.e. countries where English plays an important ‘second language’ role in a multilingual setting; Kachru, 1988). The four sub-corpora under scrutiny here are UK (inner circle), India, Pakistan and Nigeria (outer circle). In doing so, I will try to describe the different ways speakers from radically different cultural, social and religious regions conceptualize

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org
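The transitional probabilities (TPs) at issue in the fragment-priming study above, i.e. the probability with which a fragment-final consonant, or a whole fragment such as bor, predicts the following segment, can be illustrated with a minimal sketch. It is not the authors' procedure: the toy lexicon and its frequencies are invented, the function name is hypothetical, and the counting convention (token-weighted counts, every occurrence of the context counted in the denominator) is an assumption made for illustration.

```python
def transitional_probability(lexicon, context, nxt, word_initial=False):
    """Estimate P(nxt | context) over corpus tokens.

    lexicon      -- dict mapping word form -> token frequency (toy data here)
    context      -- conditioning string, e.g. "r" or the fragment "bor"
    nxt          -- segment(s) whose occurrence is being predicted, e.g. "s"
    word_initial -- if True, only count `context` at the start of a word
                    (assumed convention for fragment primes)
    """
    context_tokens = 0   # tokens in which the context occurs
    followed_tokens = 0  # tokens in which the context is followed by nxt
    for word, freq in lexicon.items():
        if word_initial:
            starts = [0] if word.startswith(context) else []
        else:
            starts = [i for i in range(len(word)) if word.startswith(context, i)]
        for start in starts:
            end = start + len(context)
            context_tokens += freq
            if word.startswith(nxt, end):
                followed_tokens += freq
    return followed_tokens / context_tokens if context_tokens else 0.0


# Hypothetical frequencies, invented for the example only.
TOY_LEXICON = {"borsa": 40, "bordo": 30, "borgo": 10, "corsa": 20, "caro": 50}

if __name__ == "__main__":
    # Biphone TP: how strongly "r" predicts a following "s".
    print(transitional_probability(TOY_LEXICON, "r", "s"))
    # Fragment TP: how strongly the prime "bor" predicts a following "s".
    print(transitional_probability(TOY_LEXICON, "bor", "s", word_initial=True))
```

In this toy lexicon the fragment bor predicts s more strongly than the single consonant r does, which is the kind of contrast between fragment-level and biphone-level predictability that the study manipulates.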

‘Taste’ and its conceptual extensions: the example of Croatian root kus/kuš and Turkish root tat

Ida Raffaelli
Faculty of Humanities and Social Sciences
Ivana Lučića 3, 10000 Zagreb, Croatia
[email protected]

Barbara Kerovec
Faculty of Humanities and Social Sciences
Ivana Lučića 3, 10000 Zagreb, Croatia
[email protected]

This paper deals with the concept of ‘taste’ and its importance in the formation of Croatian and Turkish lexicon. ‘Taste’ as one of five basic sensory concepts serves as a source domain in conceptualizing various abstract domains, mostly related to human internal sensations (Sweetser, 1990). However, within the research of perception vocabulary, lexical structures related to the concept of ‘taste’ have been among the least investigated areas, especially according to different parts of speech and their correlation in building of vocabulary. A comparative analysis of the taste vocabulary in two typologically different and genetically unrelated languages like Croatian and Turkish could reveal the differences and similarities in processes that come into play in building their vocabulary. This is the reason why these two languages are chosen for the analysis. According to the embodiment hypothesis within Cognitive Linguistic theoretical framework, it can be expected that Croatian and Turkish share conceptual extensions towards the same abstract domains. However, since the two languages are typologically different and immersed in different cultures, some differences in conceptual mappings are also expected. Thus, one of the main goals of the present research is to provide a more fine grained analysis of semantic extensions of the taste vocabulary in the two languages. Besides examining similarities and differences in conceptual mappings, the aim of the paper is also to see to what extent the two languages differ with respect to lexicalization patterns that influence formation of the ‘taste’ vocabulary.

Croatian and Turkish taste vocabularies are described with respect to the morphosemantic structures of Croatian root kus/kuš “taste” and Turkish root tat “taste”. The model of morphosemantic patterns (MP model) as developed by Raffaelli and Kerovec (2008) and Raffaelli (2013) regards the lexicon as morphologically and semantically related, i.e. each motivated lexeme is related to a root with respect to the word-formation processes and to the semantic (cognitive) processes. Moreover, the MP model regards the lexicon as a constructional continuum with no clear-cut boundaries between grammatical and lexical structures (cf. Langacker, 1987; Goldberg, 1995; Booij, 2010). It means that constructions such as okušati se “to try; to give it a go”, okušati se u “to try out (a certain activity)” and okušati se kao “to try (out) as” are regarded as separate lexical units since they differ with respect to their usage, and exhibit differences in their meanings and their syntactic realizations. The MP model is a usage based model, thus conclusions about lexical structures and meanings are based upon a detailed analysis of lexical realizations in different contexts.

Meanings and contextual realizations of all analyzed lexical units in Croatian and in Turkish have been checked in the Croatian National Corpus, Croatian Web Corpus and METU Turkish Corpus.

As pointed out by Viberg (1984), concept of ‘taste’ is in general extended towards domains ‘like’/‘dislike’. Moreover, some cross-linguistic evidence (cf. Viberg, 1984; Evans and Wilkins 2000) shows a regular and frequent extension of taste verbs towards the meanings “to try”, “to experience”, “to enjoy”. Although some cross-linguistic regularities of conceptual extensions of the concept ‘taste’ have already been established, the comparative study of Croatian and Turkish taste vocabulary shows that there are some other abstract domains conceptualized by the domain of taste. Such domains are for example ‘ambience’, ‘mood’, ‘atmosphere’, ‘charm’, ‘enchantment’, that are all conceptualized by the domain of taste in Turkish, but not in Croatian, as shown in some examples below.

In Croatian the root kus/kuš is a basis of the verb kušati “to taste” that, by the process of prefixation, enabled formation of various verbs and constructions such as pokušati “to try; to attempt”, iskušati/iskušavati/iskusiti “to try; to experience”, prokušati se “to try; to try out”, okušati se (u/kao) “to try (out) (as)” that differ with respect to prefixes (and prepositions) and thus with respect to their usage and meanings. The perfective verb okusiti “to taste” differs from the verb kušati primarily in aspect; however, all the other verbs cannot be used in relation with tasting food. They exclusively have abstract meanings like nouns kušnja and iskušenje “temptation”. Croatian is somehow specific with respect to the existence of two morphologically closely related nouns: okus “taste” and ukus “system of aesthetic judgement”, differing significantly according to their semantic structures. A distinction in usage and meanings of the two nouns will be analyzed and some specificities will be pointed out.

Morphosemantic field of the Turkish root tat exhibits some similarities and some differences in comparison to the morphosemantic field of the Croatian root kuš/kus. Tat “taste” is a noun used as a basis in the formation of the verb tatmak “to taste” and of the phrasal verbs tadını görmek “to taste” (lit. “to see the taste of”) and tadına bakmak “to taste” (lit. “to look at the taste of”). This means that, unlike in Croatian, verbs for visual perception are used for lexicalization of taste experience and taste activity. Similarly to Croatian, all three verbs relate to the domain of food as well as to the abstract domain of experience (e.g. hayat tadını görmek “to taste/experience life”, lit. “to see the taste of life”). Turkish verbs do not extend their meanings to all abstract domains Croatian prefixed verbs do: they do not share meanings with Croatian verbs pokušati “to try; to attempt”, okušati se, okušati se u, okušati se kao “to try (out) (as)”, nor can they be related to the abstract domain of temptation (as with Croatian iskušavati “to tempt; to test”, iskušenje “temptation”, kušnja “temptation; crucible”). Similarly, Turkish root tat cannot relate to the domain of aesthetic judgement (Croatian ukus), but when morphologically extended by suffixes –li “with” or –siz “without”, it extends to some domains Croatian root does not: tatlı (lit. “with taste”) does not mean “tasty”, but “sweet”. Accordingly, tatlı relates to a variety of pleasant experiences (feelings, climate, activities), while tatsız means “untasty”, but also “unpleasant”, “irritating”, “disturbing”, “annoying” etc. In addition, Croatian root kus/kuš cannot be used to express “enjoying” as Turkish root tat can (e.g. tatilin tadını çıkarmak “to enjoy holidays”, lit. “to extract the taste of holidays”). As far as contextual realizations are concerned, one of the most prominent differences between Croatian and Turkish is that Turkish root tat, besides verbs for visual perception, combines with verbs expressing motion (Paris’in tadına varmak “to experience the spirit/charm of Paris”, lit. “to come to the taste of Paris”), taking (tadını almak “to taste”, “to experience”, “to enjoy”, lit. “to take the taste of”; tadını çıkarmak “to enjoy”, lit. “to extract the taste of”), and cognitive activity (tadını bilmek “to experience”, lit. “to learn/to know the taste of”; tadını tanımak “to experience”, lit. “to get to know the taste of”), which is not the case in Croatian. Combining nouns and verbs derived from the same root is also characteristic for Turkish but not for Croatian (tadını tatmak “to taste the taste of”).

Thus, it could be claimed that Croatian verbs with extended abstract meanings are mostly realized in constructions such as [pref – Vkus/kuš – prep] as okušati se u “to try out”, whereas Turkish verbs with extended meanings mostly appear in constructions such as [Ntat – V] in which verbs within a construction often refer to concrete domains based in human experience, like for example motion.

The aim of this paper is: a) to provide an exhaustive description of the structure of the taste vocabulary related to the roots kuš/kus in Croatian and tat in Turkish, b) to point to some similarities and differences in the conceptual extensions of the concept ‘taste’ in the two languages and thus in the organization of their vocabularies, c) to implement the MP model in the description of lexical structures of non IE languages, and thus demonstrate its applicability in the lexical analysis of typologically different languages, pointing to regular and specific lexicalization patterns in the two languages.

References:

Geert Booij. 2010. Construction Morphology. Oxford: Oxford University Press.
Nicholas Evans and David Wilkins. 2000. In the Mind's Ear: The Semantic Extension of Perception Verbs in Australian Languages. Language 76(3): 546-592.
Adele E. Goldberg. 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago and London: The University of Chicago Press.
Ronald W. Langacker. 1987. Foundations of Cognitive Grammar: Volume 1, Theoretical Prerequisites. Stanford: Stanford University Press.
Ida Raffaelli and Barbara Kerovec. 2008. Morphosemantic fields in the analysis of Croatian vocabulary. Jezikoslovlje (9.1-2): 141-169.
Ida Raffaelli. 2013. The model of morphosemantic patterns in the description of lexical architecture. Lingue e linguaggio 1: 47-72.
Eve Sweetser. 1990. From Etymology to Pragmatics. Metaphorical and Cultural Aspects of Semantics. Cambridge: Cambridge University Press.
Åke Viberg. 1984. The verbs of perception: A typological study. In: Brian Butterworth, Bernard Comrie, and Östen Dahl (Eds.) Explanations for language universals. Berlin: Mouton de Gruyter. 123-62.

discussed by cognitive scholars: (a) the trajectors and verbs involved depart from those typically described in fictive motion, and (b) the constructions dealing with buildings and wines do not comply with the unidirectional concrete-onto-abstract quality of the metaphorical mappings described in, for instance, the expression of financial issues or emotions, but involve concrete sources and targets. This suggests that fictiveness as opposed to metaphoricity may be a question of degree, yet this can only be ascertained by considering all the factors underlying the use of motion constructions in communication, from the trajectors involved to the reasons motivating their use. Third, while English and Spanish differ in the expression of real motion events, their differences are less dramatic in the expression of figurative motion which, again, points to the impact of culture and genre in the language use.

Acknowledgements

This research has been supported by ESF Short Visit Grants to both authors (NetWords 09-RNO-089, European Science Foundation) and by the Spanish Government (MovEsII, FFI2013-45553-C3-2-P; FFI2013-45553-C3-1-P).

References

Vijay K. Bhatia. 1999. Integrating products, processes, purposes and participants in professional writing. In C. Candlin and K. Hyland (eds.), Writing: Texts, Processes and Practices, 21-40. London: Longman.
Ronald W. Langacker. 1986. Abstract motion. Proceedings of the Twelfth Annual Meeting of the Berkeley Linguistics Society, 12: 455-471.
Teenie Matlock and Till Bergmann. In press. Fictive Motion. In E. Dabrowska and D. Divjak (eds.), Mouton Handbook of Cognitive Linguistics. Berlin: Mouton de Gruyter.
Yo Matsumoto. 1996a. Subjective motion and English and Japanese verbs. Cognitive Linguistics 7, 183-226.
Yo Matsumoto. 1996b. How abstract is subjective motion? A comparison of access path expressions and coverage path expressions. In A. Goldberg (ed.), Conceptual Structure, Discourse and Language, 359-373. Stanford: CSLI Publications.
Michael W. Morris, Oliver J. Sheldon, Daniel R. Ames and Maia J. Young. 2007. Metaphors and the market: Consequences and preconditions of agent and object metaphors in stock market commentary. Organizational Behavior and Human Decision Processes, 102: 174-192.
Şeyda Özçalışkan. 2003. Metaphorical motion in crosslinguistic perspective: A comparison of English and Turkish. Metaphor and Symbol, 18 (3): 189-228.
Şeyda Özçalışkan. 2004. Encoding the manner, path, ground components of a metaphorical motion event. Annual Review of Cognitive Linguistics, 2: 73-102.
Şeyda Özçalışkan. 2005. Metaphor meets typology: Ways of moving metaphorically in English and Turkish. Cognitive Linguistics, 16 (1): 207-246.
Şeyda Özçalışkan. 2007. Metaphors we move by: Children’s developing understanding of metaphorical motion in typologically distinct languages. Metaphor and Symbol, 22 (2): 147-168.
Dan I. Slobin. 1996. Two ways to travel: Verbs of motion in English and Spanish. In M. Shibatani and S. A. Thompson (eds.), Grammatical Constructions: Their Form and Meaning, 195-219. New York: Oxford University Press.
Dan I. Slobin. 2004. The many ways to search for a frog. In S. Strömqvist and L. Verhoeven (eds.), Relating Events in Narrative. Typological and Contextual Perspectives, 219-257. Hillsdale, NJ: Lawrence Erlbaum.
Leonard Talmy. 1991. Path to realization: A typology of event conflation. Proceedings of the Seventeenth Annual Meeting of the Berkeley Linguistics Society, 17: 480-519.
Leonard Talmy. 2000. Toward a Cognitive Semantics. Cambridge, MA: MIT Press.

1/3 in their initial part, 1/3 in their central part and 1/3 in their final part. The 30 clusters used for pseudowords did not appear in the words’ list.

In the lexical decision task, participants were asked to press a button corresponding to their dominant hand as soon as the orthographically presented target was judged as a word, and a different button for targets judged as nonwords. All the stimuli appeared in Courier New font, 18 point size in the center of the computer screen. In order to avoid allographic effects, primes were displayed in uppercase and targets in lowercase. The fixation was 200 ms, followed by a 50 ms pause. Primes appeared for 150 ms, followed by a 50 ms pause. The targets remained on the computer screen for a maximum of 1 sec. If the participants did not produce any answer within that time, the feedback Fuori tempo (‘Out of time’) appeared on the screen. Reaction times (RTs) and the number of errors (Nerr) constituted the dependent variables. The reaction times were measured from target onset to subject’s response, and responses given after the deadline were scored as errors.

The experiment was preceded by a practice session. When the participants reached 70% of valid responses the experiment started.

2.2 Experimental variables

Several statistical and distributional properties of word primes, targets and clusters were derived from the CoLFIS corpus (Bertinetto et al., 2005).

For each prime-target pair, we calculated (i) the token frequency of the target (‘TargetFreq’), (ii) the N of words beginning with the prime fragment (‘PrimeTypeFreq’), (iii) the cumulated frequency of the words in (ii) (‘PrimeTokenFreq’), (iii) the length of the target (in N graphemes), (iv) the length of the prime (in N graphemes), (v) the prime/target length ratio.

For each cluster, we calculated (vi) the TP value, i.e. the probability with which the first consonant of the cluster predicts the occurrence of the following consonant, calculated over the corpus word tokens (‘BigramTP’), (vii) the N of words containing the cluster (‘BigramTypeFreq’), (viii) the cumulated frequency of the words in (vii) (‘BigramTokenFreq’), (ix) the TP between the fragment prime and the second consonant of the cluster, e.g. P(s|bor) in borsa ‘bag’ (‘SequenceTP’), (x) the N of words containing the sequence of the prime followed by the second C of the cluster (‘SequenceTypeFreq’), (xi) the cumulated frequency of the words in (x) (‘SequenceTokenFreq’).

2.3 Analysis and results

Fixed and mixed models with subject and prime as random variables were used.

For the purposes of the present study, we tested two different models, both including frequency variables and phonotactic probability variables; they are shown in Table 1. The two models differed for the presence, in model II, of a measure of prime frequency, which was not included in model I, and for being focused either on sequence and bigram token frequencies (model I), or on sequence and bigram type frequencies. Both models were tested for CC items (e.g. bor-sa, ‘bag’) and CV items (e.g. tuc-ano ‘toucan’) separately.

                 Model I              Model II
Fixed effects    TargetFreq           TargetFreq
                 LengthRatio          PrimeTokenFreq
                 SequenceTokenFreq    LengthRatio
                 BigramTokenFreq      SequenceTypeFreq
                 SequenceTP           BigramTypeFreq
                 BigramTP             SequenceTP
                                      BigramTP
Random effects   Subject              Subject
                 Fragment prime       Fragment prime

Table 1. Fixed and random effects for the CC and CV items.

The results of the fixed effects analyses for the relevant models are summarized in Table 2 (dependent variable: RTs) and Table 3 (dependent variable: Nerr).

According to model I, with RTs as the dependent variable, the sequence’s TP (i.e., the TP between the fragment prime and the second consonant of cluster) turned out to be the most significant predictor, even outranking the contribution of frequency values (for the target, the sequence and the bigram), which all concurred to the intercept. A different picture emerged however for the CV items, for which no probability variables turned out to significantly predict the subjects’ response times; on the contrary, the target frequency, with the secondary contribution of the frequency of the cluster, appeared to play a role for this subset of items.

According to model II, for CC items the role of the target frequency turned out to be very important, and the only additional effect was generated by the sequence’s TP. Thus the two models were similar in emphasizing the role of the probability with which a given C follows the prime sequence. As for CV items, model II returned a picture very similar to the one that emerged in model I, with target frequency and bigram type frequency as the only significant predictors.

Table 2. Fixed effects coefficients for the two models, CC and CV items (RTs=dependent variable).

When subject and prime were included as random factors, the pairwise comparison in the likelihood ratio test confirmed that the contribution of the sequence’s TP significantly increased the predictability of the RTs patterns: χ²(1) = 11.184, p = 0.0008 in model I, χ²(1) = 5.4403, p = 0.019 in model II.

The average reaction times and the number of errors were positively and significantly correlated, though with an intermediate correlation coefficient (r = .648, p < .01). We thus tested the two models with Nerr as the dependent variable, in order to determine if the error rate was influenced by frequencies and probabilities to a different extent than response latencies.

With Nerr as the dependent variable, R² values were consistently lower than in the RTs simulations (Table 3), thus indicating that the error patterns were accounted for by our frequency and probability variables to a more limited extent. In particular, both model I and model II emphasized for the CC items the role of target frequency as the only significant predictor of errors, while for CV items an additional role of bigram frequencies (by token and by type, respectively) was found. Thus for the CV items, RTs and error rate produced consistent results.

Table 3. Fixed effects coefficients for the two models, CC and CV items (Nerr=dependent variable).

3 Discussion

This work aimed to shed light on the role of TPs in a so far unstudied experimental environment, i.e., a lexical decision task with fragment priming. As most studies on phonotactic probabilities have focused on English, this work also added to the field with evidence from a poorly investigated language, Italian.

Fragment priming is known to be modulated not only by word frequency and the frequencies of words matching the fragment but also by top-down information conveyed by the prime: a fragment prime matching a unique morpho-lexical family is as effective as a stem prime, thus showing that priming acts as a cue for the properties displayed in the target (see e.g. Laudanna & Bracco, 2006, for Italian).

This study has shown that the priming effect when an initial fragment is available is influenced also by bottom-up variables; in particular, it depends on the probability with which the segments composing the fragment or the fragment-final consonant predict the occurrence of the consecutive consonant. Although to a lesser extent, the frequency with which bigrams and sequences occur (as types or tokens) in the lexicon also predicts the subjects’ behavior. Phonotactic probabilities thus turned out to predict the subjects’ response to a large degree for many of the phonological environments tested in the current experiment, sometimes outperforming target frequencies, and consistently overtaking the contribution of the prime/target length ratio and of the prime frequency.

The results however suggested that the phonotactic probabilities in the case of consonant clusters were overall more important than in the case of consonant-vowel sequences; thus it must be concluded that the contribution of TPs in lexical recognition is not the same across phonological environments. Consonant clusters might play a particularly relevant role in lexical access, compared to CV sequences, as contemporary theories based on the principles of phonological and morphological naturalness also seem to predict (see e.g. Dressler & Dziubalska-Kolaczyk, 2006; Korecky-Kroell et al. 2014).

Additionally, for CC sequences the token frequencies (of the bigram and of the prime + C sequence) turned out to be relatively more important than the corresponding type frequencies, thus suggesting that the exposure to the number of occurrences of a cluster or of a segment sequence may be more important in lexical access than the exposure to the individual items containing them.

An additional issue concerns the role of TPs in morphologically complex words. According to some models, morphological parsing is necessary for lexical access and the prefix (in the case of prefixed words) has to be stripped away in order for the word to be recognized (from Taft & Forster, 1975 onwards). Assuming a condition in which the fragment prime coincides with a prefix, TPs would play the additional role of marking the morphological boundary during the priming event. According to the results of the current study, it appears to be of utmost importance to further verify whether prefixed and pseudo-prefixed words behave in the same way. In fact, models postulating morphological pre-parsing (e.g. Schreuder & Baayen, 1995) would suggest that high TPs will codetermine latencies for prefixed targets only, while if morphology does not affect word recognition, then the TPs between the fragment prime and the following segment composing the target will modulate latencies in prefixed and pseudo-prefixed words to the same extent.

A follow-up experiment will therefore test the contribution of phonotactic statistical knowledge in native speakers’ access to complex word forms (specifically, prefixed nouns). Prefixed and pseudo-prefixed words will be used for that purpose. In particular, fragment primes will be selected according to two different conditions: in condition a) the targets are prefixed words and the fragment prime coincides with the prefix (e.g. bis-bisnonna ‘great-grandmother’); in condition b) the targets are pseudo-prefixed words and no morphological boundary occurs between the initial fragment and the second part of the word (e.g. per-perdente ‘loser’). Together with the current experiment, the experiment on prefixed and pseudo-prefixed words will determine whether or not the role of TPs is different when the target is a simplex word compared to when it is a prefixed word, and to when it is a pseudo-prefixed word. Different hypotheses may be put forward here, according to whether or not morphological boundaries affect the processing of consonant clusters (e.g., Calderone et al. 2014, Celata et al. 2015 in press), and according to the likelihood that a given sequence occurs as morpheme or as homographic non-morphological pattern (see Laudanna et al., 1994).

By describing phonotactic probability and frequency effects during word recognition, this study offers arguments to models of lexical access based on bottom-up processes such as cohort models for orthographic stimuli (see e.g. Johnson & Pugh, 1994). The property of single consonants to predict the following segment then speeding up the recognition of the whole word, as an additional if not independent way to access words and their subparts, might also be discussed with reference to models that associate orthographic input units to semantic and lexical knowledge (from connectionist models such as in Harm & Seidenberg, 1999, to amorphous models such as in Baayen et al. 2011).

References

Harald R. Baayen. 2010. A real experiment is a factorial experiment? The Mental Lexicon, 5(1): 149-157.
Harald R. Baayen, Petar Milin, Dusica Filipovic Durdevic, Peter Hendrix and Marco Marelli. 2011. An amorphous model for morphological processing in visual comprehension based on naive discriminative learning. Psychological Review, 118: 438-482.
Pier Marco Bertinetto, Cristina Burani, Alessandro Laudanna, Lucia Marconi, Daniela Ratti, C. Rolando and Anna Maria Thornton. 2005. Corpus e Lessico di Frequenza dell’Italiano Scritto (CoLFIS). http://linguistica.sns.it/CoLFIS/Home.net
Basilio Calderone, Chiara Celata, Katharina Korecky-Kroell and Wolfgang U. Dressler. 2014. A computational approach to (mor)phonotactics: Evidence from German. Language Sciences, 46 (part A): 59-70.
Chiara Celata, Katharina Korecky-Kroell, Irene Ricci, and Wolfgang U. Dressler. 2015 (in press). Online processing of German (mor)phonotactic clusters by adults and adolescents. Italian Journal of Linguistics, 27(1).
Wolfgang U. Dressler and Katarzyna Dziubalska-Kolaczyk. 2006. Proposing Morphonotactics. Italian Journal of Linguistics, 18: 249-266.
Katharina Korecky-Kroell, Wolfgang U. Dressler, Eva Maria Freiberger, Eva Reinisch, Karlheinz Moerth and Gary Libben. 2014. Phonotactic and morphonotactic processing in German-speaking adults. Language Sciences, 46 (part A): 48-58.
N.F. Johnson and K.R. Pugh. 1994. A cohort model of visual word recognition. Cognitive Psychology, 26: 240-346.
Alessandro Laudanna, Cristina Burani and Antonella Cermele. 1994. Prefixes as processing units. Language and Cognitive Processes, 9: 295-316.
Alessandro Laudanna and Giulia Bracco. 2006. Stem and fragment priming on verbal forms of Italian. In Proceedings of the 5th International Conference on the Mental Lexicon (Montreal, Canada, 11-13 October, 2006): 26.
Paul A. Luce and Nathan R. Large. 2001. Phonotactics, density, and entropy in spoken word recognition. Language and Cognitive Processes, 16: 565-581.
Sven L. Mattys and Peter W. Jusczyk. 2001. Phonotactic cues for segmentation of fluent speech by infants. Cognition, 78: 91-121.
Mark Pitt and James McQueen. 1998. Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language, 39: 347-370.
Robert Schreuder and Harald R. Baayen. 1997. How complex simplex words can be. Journal of Memory and Language, 37: 118-139.
Holly L. Storkel. 2001. Learning nonwords: Phonotactic probabilities in language development. Journal of Speech, Language, and Hearing Research, 44: 1321-1337.
Marcus Taft and Kenneth I. Forster. 1975. Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior, 14: 638-647.
Michael Vitevitch, Paul Luce, J. Charles-Luce and D. Kemmerer. 1997. Phonotactics and syllable stress: Implications for the processing of spoken nonsense words. Language and Speech, 40: 47-62.
Michael S. Vitevitch and Paul A. Luce. 1999. Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory & Language, 40: 374-408.
Michael Vitevitch, Paul A. Luce, David B. Pisoni and Edward T. Auer. 1999. Phonotactics, neighborhood activation and lexical access for spoken words. Brain and Language, 68: 306-311.
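The likelihood ratio comparisons reported for the fragment-priming study (χ²(1) = 11.184 and χ²(1) = 5.4403) can be checked numerically. The following is a minimal sketch, assuming that each comparison involves two nested models differing by a single parameter (the sequence TP), so that the statistic is referred to a chi-square distribution with 1 degree of freedom. The function names and the log-likelihood values in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def chi2_sf_1df(statistic):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so only the standard library is needed.
    """
    return math.erfc(math.sqrt(statistic / 2.0))


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Compare two nested models that differ by one parameter.

    Returns the likelihood ratio statistic 2 * (logL_full - logL_reduced)
    and its p-value under a chi-square distribution with 1 df.
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf_1df(statistic)


if __name__ == "__main__":
    # Statistics as reported in the text for model I and model II.
    for reported in (11.184, 5.4403):
        print(reported, chi2_sf_1df(reported))
    # Hypothetical log-likelihoods chosen only to reproduce the first statistic.
    print(likelihood_ratio_test(-105.592, -100.0))
```

Under these assumptions the computed p-values come out close to the reported p = 0.0008 and p = 0.019, which supports reading both tests as single-parameter comparisons.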

From physical to metaphorical motion: A cross-genre approach

Rosario Caballero
Universidad de Castilla-La Mancha (Spain)

Iraide Ibarretxe-Antuñano
Universidad de Zaragoza (Spain)

Copyright © by the paper's authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

1 Introduction

Talmy (1991, 2000) classifies languages into verb-framed and satellite-framed types according to whether the Path of a motion event is lexicalized as a satellite of the main verb in the clause or as the verb itself. Thus, in English (and other S-languages like Dutch or Danish) verbs often encode rich information concerning Manner, Cause and/or Movement but need a so-called satellite to convey the Path of motion. In contrast, in Spanish and Romance languages in general, verbs are mainly concerned with trajectory or Path, and any other additional information (Manner or Cause of motion) is expressed by means of sentence constituents playing an adverbial role. As a result, speakers of verb-framed and satellite-framed languages appear to exhibit different rhetorical styles when describing the same motion event (Slobin, 1996, 2004).

Besides dealing with real motion, Talmy's work has provided the starting point in research on (a) fictive motion, i.e. the dynamic predication of physical yet static entities such as roads or cables, as in The road climbs over the hill (Langacker, 1986; Matsumoto, 1996a; Talmy, 2000; Matlock and Bergmann, in press), and (b) metaphorical motion, i.e. the dynamic predication of abstract entities such as the economy, emotions, and the like, as in Jealousy snaked its way into our relationship (Özçalışkan, 2003, 2004, 2005, 2007; Morris et al., 2007). In general, research on fictive and metaphorical motion has focused on the way the speakers of different languages typically describe motion events in everyday, general contexts. Although yielding interesting results for the overall characterization of languages, this may result in a degree of overgeneralization towards the phenomenon at issue. This is reinforced by the way in which the data illustrating the research claims are often presented: the examples often appear in a decontextualized manner, with little or no mention of the characteristics of the discourse context or event where they are used (typically, the discourse genre where they occur). This is unfortunate since the inclusion and description of the context of use is critical for the study of motion patterns –whether real, fictive or metaphorical– and, above all, their correct interpretation.

2 Research questions

In this talk, we discuss the lexicalization patterns of metaphorical events in genre-specific texts in English, a satellite-framed language, and Spanish, a verb-framed language. More concretely, we explore whether (a) the lexicalization and rhetorical differences between Spanish and English discussed in the motion literature are sustained in genres other than narratives, and (b) the idiosyncrasy of those genres has any typological implications and, at the same time, affects the expressions' creativity and expressiveness.

3 Methodology

We use a 600,000-word corpus comprising tennis, wine and architecture reviews written in these two languages. These genres (or genre colony (Bhatia, 2000)) fall within reviewing practices: their main goal is to describe and evaluate an event (a tennis match) or an entity (wine and buildings) for an audience that may or may not have any previous knowledge about them, yet is interested in having an assessment written by a knowledgeable source. The texts were searched by hand in order to identify the motion constructions used in them. The unit of analysis was any instance concerned with motion –figurative or otherwise. A second step involved cleaning the texts and converting them into machine-readable form in order to run a concordance program and count the verb types and number of instances (tokens) in the three sub-corpora. After identifying the verbs used in the three genres, they were classified into two main groups in agreement with two criteria. First, the semantic information of the verb involved (motion1, when the verb includes motion information in its semantic description, and motion2, when the verb, despite not being a motion verb per se, can be reinterpreted as such due to the construction it is used in) and, second, the motion elements (Path, the trajectory or course followed by the moving object; Manner, the way in which motion is performed) present in the examples. Figure 1 illustrates the coding.

[Figure 1: Example of corpus coding.]

4 Results

As far as our first goal is concerned, our results show that the lexicalization and rhetorical patterns described for Spanish and English are maintained in the specific contexts explored, and therefore, results are congruent with research done on metaphorical motion events in general contexts. However, the data also yield interesting insights: metaphorical motion instances found in specific contexts are more expressive and abundant with regard to Manner than what is the case in general uses of language. This is particularly noteworthy in the Spanish data, whose expressivity contrasts with the general tendency to omit Manner and other details of motion events in other contexts. For instance, examples such as those in (1) are frequently used in our corpus:

(1a) architecture
La senda de exhibiciones de arte nurágico se desliza entre ambas pieles del edificio permitiendo una visualización más íntima de las obras
'The exhibition path of nuragic art slides between the two skins of the building allowing a more intimate visualization of the works'

(1b) wine
En boca tiene una magnifica entrada, suave, sabroso y equilibrado […], aunque en el paso sobresalen rasgos vegetales y se precipita hacia un final en el que predominan notas tostadas y amargas
'Smooth, tasty and balanced, it enters the mouth powerfully […] although some vegetal notes peek mid journey and it plunges towards a finish where toasty and bitter notes predominate'

(1c) tennis
Murray se pasea en el ágora de Valencia
'Murray strolls in the agora in Valencia'

This expressivity is more outstanding in the case of English: the data from the specific corpus not only reinforce the high expressivity and richness of this language with regard to Manner, but add novel verbs to those susceptible to being used in the description of motion events in other contexts (e.g. hobble, sally forth, waltz…), hence showing the creativity and –almost– endless possibilities of this language in this respect.

With respect to our second goal, we found that knowledge of the genre where the expressions are used is critical to correctly understand and explain metaphorical motion instances. This is particularly salient when comparing the use of the same verb in three different genres: indeed, a single verb may foreground aspects of a given situation irrelevant in a different context. For instance, the verb tumble in (2):

(2a) architecture
A stair tumbles down from this first floor incision onto the man-made island.

(2b) wine
The fruit shows well-ripened apples and peaches all the way into pineapples and mangoes, offering up a cascade of flavors that tumble across the palate.

(2c) tennis
Andy Murray has been sent tumbling out of AO 2008 by Frenchman Tsonga

The property of tumble shared by all these examples is 'uncontrolled', but this lack of control has a different interpretation in each genre. Thus, although in (2a) tumble suggests a certain lack of order, the main concern of the verb is to convey the visual force of the stair thus described, which somehow overwhelms those gazing at it. In (2b), the 'uncontrolled' property does not suggest a certain disorder or chaos of a wine's gustatory properties; rather, it expresses a sensory overflow or gustatory richness perceived by this critic as a positive trait of a complex wine. Finally, in (2c) the verb not only conveys Tsonga's convincing win, but Murray's pain and shame when losing to an inferior player ranking-wise.

Examples like these are interesting in three respects. First, although the information conveyed by motion verbs may be perfectly obvious for architects, tennis fans and wine aficionados and critics, this may not be the case for people outside these communities. Hence, the need to underline the importance of bringing the notion of acculturation to the centre of metaphor research, i.e. the relevance of taking into account all the factors that shape a given culture and its characteristic genres within a broader cultural panorama. Second, they problematize some of the views on both fictive and metaphorical motion

Zipfian discrimination

Jim Blevins
University of Cambridge

Petar Milin
University of Novi Sad
Eberhard Karls Universität Tübingen

Michael Ramscar
Eberhard Karls Universität Tübingen

This talk outlines how form variation can be modelled in terms of equilibria between two dominant communicative pressures. The pressure to discriminate forms of a language enhances differences between expressions. Unchecked, this pressure can in principle lead to suppletion of the kind reported in languages such as Yélî Dnye (Henderson ). However, in most languages, the pressure towards maximally discriminative expressions is countered by the need to extrapolate from sparse input. It has long been known that corpora provide only a partial coverage of the forms of a language (inflectional and derivational). This talk presents evidence that the shortfall is far greater and far more systematic than previously appreciated, and that the coverage of the form variation remains sparse in corpora of up to one billion words. The sampling reported in this talk suggests that the forms in a corpus or encountered by a speaker exhibit a Zipfian distribution at all sample sizes.

The interaction of these pressures also accounts for the role of lexical neighbourhoods. Since most paradigms will be only partially attested, the organization of paradigms into neighbourhoods provides an analogical base for extrapolation.

The role of discriminability

It is usually assumed that regularity in a linguistic system is desirable or normative and that suppletion and other irregularities represent deviations from the uniform patterns that systems (or their speakers) strive to maintain. From a discriminative perspective, the situation is exactly reversed. To the extent that patterns like suppletion enhance the discriminability of forms, they contribute to the communicative efficiency of a language. In a discriminative model, such as that of Ramscar et al. (), the only difference between overtly suppletive forms such as mouse/mice and more regular forms such as rat/rats is that the former serve to accelerate the rate at which a speaker's representation of a specific form/meaning contrast becomes discriminated from the form classes that express similar contrasts. Thus all learning serves to increase the level of suppletion in form-meaning mappings.

Moreover, standard cases of 'suppletion' are merely extreme instances of discriminative contrasts that seem ubiquitous at the sub-phonemic level. In the domain of word formation, Davis et al. () found suggestive differences in duration and fundamental frequency between a word like captain and a morphologically unrelated onset word such as cap. Of more direct relevance are studies of inflectional formations. Baayen et al. () found that a sample of speakers produced Dutch nouns with a longer mean duration when they occurred as singulars than when they occurred as the stem of the corresponding plural. In a follow-up study, Kemps et al. () tested speakers' sensitivity to prosodic differences, and concluded that "acoustic differences exist between uninflected and inflected forms and that listeners are sensitive to them" (Kemps et al. : ). Recent studies by Plag et al. () find similar contrasts between phonemically identical affixes in English.

The status of regularity

From a discriminative perspective, it is regularity that stands in need of explanation. Learning models offer a solution here as well. Unlike derivational processes, inflectional processes are traditionally assumed to be highly productive, defining uniform paradigms within a given class. Lemma size is thus not expected to vary, except where forms are unavailable due to paradigm 'gaps' or 'defectiveness'. Yet corpus studies suggest that this expectation is an idealization. Many potentially available inflected forms are unattested in corpora. As corpora increase in size, they do not converge on uniformly populated paradigms. Instead, they reinforce previously attested forms and classes while introducing progressively fewer new units. As shown in Figure , the number of attested inflected noun variants decreases in all random samples, ranging from -million to -million hits, at which point the -million word StdeWaC corpus is essentially exhausted. As sample size increases, there is a marked attenuation in the steepness of the slope, though it never becomes completely flat. This trend is extracted and presented in Figure , which plots number of attested forms on the X-axis and slopes of six trends from Figure on the Y-axis. From this relationship we can infer that even if the corpus size were increased to infinity, it would never contain all possible inflected forms of every German noun. As shown in Figure , the forms of a language obey Zipf's law at all sample sizes. Speakers must be able to extrapolate from a partial – often sparse – sample of their language, and regular patterns subserve this need.

It takes a neighbourhood

In order for a collection of partial samples to allow the generation of unattested forms, the forms that speakers do know must be organized into systematic structures that collectively enable the scope of possible variations to be realized. These structures correspond to lexical neighbourhoods, whose effects have been investigated in a wide range of psycholinguistic studies (Baayen et al. ; Gahl et al. ). From the present perspective, neighbourhoods are not independent dimensions of lexical organization but, rather, constitute the creative engine of the morphological system, permitting the extrapolation of the full system from partial patterns. Interesting support for this perspective comes from the study reported in Milin et al. (). In this study, analogical extrapolation from a small set of nearest neighbors allowed a system to model the choice of masculine instrumental singular allomorph by Serbian speakers presented with nonce words. Regular paradigms thus enable language users to generate previously unencountered forms, not because they are the product of an explicit rule, or of any kind of explicit grammatical knowledge, but rather they are implicit in the distribution of forms and semantics in the language as a system, much as suggested by Hockett (: ).

    in his analogizing … [t]he native user of the language … operates in terms of all sorts of internally stored paradigms, many of them doubtless only partial

References

Baayen, R. H., Feldman, L. B. & Schreuder, R. (). Morphological influences on the recognition of monosyllabic monomorphemic words. Journal of Memory and Language , –.
Baayen, R. H., McQueen, J. M., Dijkstra, T. & Schreuder, R. (). Frequency effects in regular inflectional morphology: Revisiting Dutch plurals. In Baayen, R. H. & Schreuder, R. (eds.), Morphological Structure in Language Processing, Berlin: Mouton de Gruyter, –.
Davis, M., Marslen-Wilson, W. D. & Gaskell, M. (). Leading up the lexical garden-path: Segmentation and ambiguity in spoken word recognition. Journal of Experimental Psychology: Human Perception & Performance , –.
Gahl, S., Yao, Y. & Johnson, K. (). Why reduce? Phonological neighborhood density and phonetic reduction in spontaneous speech. Journal of Memory and Language (), –.
Henderson, J. E. (). Phonology and Grammar of Yele, Papua New Guinea. Pacific Linguistics B-, Canberra: Pacific Linguistics.
Hockett, C. F. (). The Yawelmani basic verb. Language , –.
Kemps, R. J. J. K., Ernestus, M., Schreuder, R. & Baayen, R. H. (). Prosodic cues for morphological complexity: The case of Dutch plural nouns. Memory & Cognition (), –.
Milin, P., Keuleers, E. & Filipović Đurdjević, D. (). Allomorphic responses in Serbian pseudo-nouns as a result of analogical learning. Acta Linguistica Hungarica , –.
Plag, I., Homann, J. & Kunter, G. (). Homophony and morphology: The acoustics of word-final S in English. Ms, Heinrich-Heine-Universität, Düsseldorf.
Ramscar, M., Dye, M. & McCauley, S. M. (). Error and expectation in language learning: The curious absence of mouses in adult speech. Language (), –.
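The claim that samples of a language stay sparse and Zipfian at every sample size can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it draws tokens from an idealized Zipfian distribution (probability proportional to 1/rank) over a hypothetical inventory of 5,000 form types, and it does not use the corpus or the sampling procedure of the talk. The names `zipf_sample`, `spectrum_summary` and `N_TYPES` are made up for this example.

```python
import random
from collections import Counter

N_TYPES = 5000  # hypothetical number of possible form types (assumption)


def zipf_sample(n_types, n_tokens, seed=0):
    """Draw n_tokens form types with probability proportional to 1/rank."""
    rng = random.Random(seed)
    ranks = range(1, n_types + 1)
    weights = [1.0 / r for r in ranks]
    return rng.choices(ranks, weights=weights, k=n_tokens)


def spectrum_summary(tokens, n_types):
    """Count attested types, hapax legomena and types never seen in the sample."""
    counts = Counter(tokens)
    hapax = sum(1 for c in counts.values() if c == 1)
    return {
        "attested": len(counts),
        "hapax": hapax,
        "unattested": n_types - len(counts),
    }


# Nested samples: each larger sample extends the smaller one.
tokens = zipf_sample(N_TYPES, 20000)
summaries = {
    size: spectrum_summary(tokens[:size], N_TYPES)
    for size in (1000, 5000, 20000)
}

for size, summary in summaries.items():
    print(size, summary)
```

In this toy setting the number of attested types keeps growing with sample size while a share of the inventory remains unattested, which is the qualitative pattern the text describes for inflected noun forms.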

[Figure : The paradigm non-filling pattern. X-axis: number of noun inflectional variants (1 to 4); Y-axis: log-count of nouns; one line per sample size (1M, 3M, 6M, 9M, 12M, 15M).]

[Figure : Asymptoting slopes. X-axis: number of forms (1M, 3M, 6M, 9M, 12M, 15M); Y-axis: slope estimates for log-count of nouns.]

[Figure : Zipf plot for randomly sampled words. X-axis: m (1 to 12, ...); Y-axis: E[Vm]. Sample sizes (and number of hapax legomena): 1M (1107), 3M (2305), 6M (3187), 9M (8035), 12M (8633), 15M (7365).]

References

Dagmar Divjak. 2010. Structuring the lexicon: a clustered model for near-synonymy. Berlin: de Gruyter.
Stefan Th. Gries & N. Otani. 2010. Behavioral profiles: a corpus-based perspective on synonymy and antonymy. ICAME Journal, 34: 121–150.
Steven Jones, M.L. Murphy, Carita Paradis & Caroline Willners. 2012. Antonyms in English: Construals, constructions and canonicity. Cambridge University Press, Cambridge, UK.
Anna Lobanova. 2012. The Anatomy of Antonymy: A Corpus-Driven Approach. Dissertation, University of Groningen.
Carita Paradis. 2005. Ontologies and construals in lexical semantics. Axiomathes, 15: 541–573.
Carita Paradis, Caroline Willners & Steven Jones. 2009. Good and bad opposites: using textual and psycholinguistic techniques to measure antonym canonicity. The Mental Lexicon, 4(3): 380–429.
Carita Paradis, Simon Löhndorf, Joost van de Weijer & Caroline Willners. 2015. Semantic profiles of antonymic adjectives in discourse. Linguistics, 53.1: 153–191.
D. Roehm, I. Bornkessel-Schlesewsky, F. Rösler & M. Schlesewsky. 2007. To predict or not to predict: Influences of task and strategy on the processing of semantic relations. Journal of Cognitive Neuroscience, 19(8): 1259–1274.
Debela Tesfaye & Michael Zock. 2012. Automatic Extraction of Part-whole Relations. In Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science.
Michael Zock & Debela Tesfaye. 2012. Automatic index creation to support navigation in lexical graphs encoding part of relations. Proceedings of the 3rd Workshop on Cognitive Aspects of the Lexicon (CogALex-III), COLING 2012.

Effects of processing complexity in perception and production. The case of English comparative alternation

Gero Kunter
English Language and Linguistics
Heinrich-Heine-Universität Düsseldorf

Abstract

This paper discusses the effect of processing complexity on the English comparative alternation. The reported experiments show a processing advantage of the synthetic comparative in perception, but a preference of the analytic comparative in sentence production if the base adjective is cognitively complex. These results imply that perceptual complexity and complexity in production have diverging effects on the English comparative alternation. More generally, the paper calls for a fine-grained look at the role of processing complexity in areas of morphosyntactic variation.

1 Introduction

Most English comparatives are formed using either a synthetic form (e.g. easier) or an analytic form (e.g. more important). While most adjectives clearly prefer either the synthetic or the analytic comparative, there is a considerable number of adjectives which frequently take both forms, e.g. more friendly vs. friendlier. The decision for either form is influenced by several phonological, morphological, syntactic and semantic factors. For example, the probability of analytic comparatives increases with the number of morphemes in the adjective base. It is also higher if the comparative is in predicative than in attributive position, and it decreases with an increasing comparative/positive ratio (see Szmrecsanyi 2005, Hilpert 2008 and Mondorf 2009 for detailed discussions). Mondorf (2009) argues that these factors are all part of a more general, audience-oriented compensatory mechanism called more-support: if the cognitive complexity of the adjectival base or its environment increases, speakers prefer the analytic comparatives, because they have a processing advantage over the corresponding synthetic form. For instance, an adjective that is morphologically complex is assumed to be also cognitively more complex than a simplex adjective, and in order to compensate for this increased cognitive complexity, speakers may prefer the analytic comparative over the synthetic alternative.

Yet, there is only little psycholinguistic research that investigated this assumed processing advantage of analytic forms. A notable exception is Boyd (2007, ch. 2) who conducted a self-paced reading experiment to investigate processing differences between synthetic and analytic comparatives. Indeed, he reports shorter reaction times for the sentences containing analytic comparatives, but due to the experimental design, this evidence is only indirect and allows for alternative interpretations. As yet, then, there is only limited empirical evidence for the assumption that analytic comparatives are easier to process than synthetic comparatives. In addition, as pointed out by Mondorf (2014, 201), it is still an unresolved issue whether more-support is a response to increased processing loads in production or in perception.

This paper addresses these two issues. First, it presents the results from a perception experiment which tested whether analytic comparatives are indeed easier to process for listeners. Contrary to this hypothesis, the reaction times show that analytic comparatives have a processing disadvantage in perception. Then, a production experiment is discussed which elicited spoken sentences containing a comparative construction. The analysis reveals that the processing complexity is a significant predictor of the comparative alternation: with increasing complexity of the base adjective, the probability of analytic comparatives increases. Thus, the paper argues that speakers and listeners process the English comparative variants differently, and that it is the speaker who benefits from a compensatory use of more comparatives.

• The strength of the co-occurrence depends on the domain: slow: fast in the domains of growth, lines, motion, movement, speed, trains, music, pitch; slow: quick in the domains of time, march, steps; slow: gradual in the domains of process, change, transition; small: big in the domains of screen, band; small: large in the domains of intestine, companies, businesses; weak: strong in the domains of force, interaction, team, ties, points, sides, wind.

The synonyms:
• Co-occurred in the same sentence but mainly in different domains. For instance, fast: quick, strong: heavy. There were few co-occurrences in the same sentences in the same domains, as exhibited by the pair gradual: slow in the domains of process, change, development.
• The strength of the synonym co-occurrence depends on the domains. For instance, the synonyms strong: heavy in wind and rain domains respectively to express intensity; the synonyms large: wide in the domains of population and distribution respectively; gradual: slow in the domains of process, change, development; small: low in the domain of size cost, range, size weight, area, size price, amount density; micro: small in the domains of enterprises, businesses, entrepreneurs.

3.2 The variant domain dependent co-occurrence method

As mentioned before, the variant domain dependent co-occurrence extraction algorithm mines the patterns of co-occurrence information of the synonyms and antonyms in different sentences. The result from the variant co-occurrence experiment showed hardly any differences in the domains with which the synonyms and antonyms are associated. Strong in the domains of influence, force, wind, interactions, evidence, ties; heavy in the domains of loss, rain, industry, traffic; gradual: slow in the domains of process, change, transition. However, we observed that the frequency of co-occurrence differed significantly. For instance, the frequency of the pair gradual: slow was 76 in the same-sentence experiment but 1436 in the variant co-occurrence experiment.

4 Comparison with related works

Previous research has shown that there are antonyms that are strongly opposing (canonical antonyms) (Paradis et al. 2009, Jones et al. 2012). Such antonyms are very frequent in terms of co-occurrence as compared to other antonyms: small: large as compared with small: big. In this experiment we found that the canonical antonyms are those whose domains of use are numerous and productive. For instance, the number of domains for small: large (11704) is by far greater than for small: big (120). However, this does not make the antonym small: large more felicitous in all the domains. Small: big are the most felicitous antonyms for domains such as screen and band, as compared to small: large.

Measuring the strength of antonyms without taking domains into account provided higher values for the canonicals, as they tended to be used in several domains. If domains were taken into account, as we did in this experiment, all the antonyms were strong in their specific domains. The antonym pair small: large had a higher value without taking domain into account, yet had a value of 0.29 in the domain of screen, where small: big has a much higher value (0.71). The values were calculated by taking the frequency of co-occurrence of the domain term (screen in this case) with each antonym pair and dividing it by the sum of the frequencies of co-occurrence of the domain term (again screen in this case) with both antonym pairs (small: big and small: large).

5 Conclusion

The strength of the antonyms/synonyms varied in relation to the domains of instantiation. The use of antonyms and synonyms was very consistent, with few overlaps across the domains. Similar results were observed in both experiments from the domain perspective, although with significant differences in frequency. Antonyms frequently co-occurred in the same domains in the same sentences, and synonyms co-occurred in different domains in the same sentences (with less frequency) and more frequently in different sentences in the same domains.

Acknowledgments

We thank the European Science Foundation (ESF) for providing the funding to undertake this work.
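The domain-relative strength described for the antonym pairs (the co-occurrence frequency of a domain term with one pair, divided by the summed co-occurrence frequency of that domain term with all competing pairs) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `domain_strength` and the counts 15 and 5 are hypothetical and chosen only for illustration; they are not the frequencies behind the reported values 0.71 and 0.29.

```python
from collections import Counter


def domain_strength(cooc, domain, pairs):
    """Share of each pair among all listed pairs' co-occurrences with a domain term.

    cooc maps (pair, domain) to a co-occurrence frequency.
    """
    total = sum(cooc[(pair, domain)] for pair in pairs)
    if total == 0:
        # No evidence for this domain: report zero strength for every pair.
        return {pair: 0.0 for pair in pairs}
    return {pair: cooc[(pair, domain)] / total for pair in pairs}


# Hypothetical co-occurrence counts with the domain term "screen" (assumption).
cooc = Counter({
    (("small", "big"), "screen"): 15,
    (("small", "large"), "screen"): 5,
})

strengths = domain_strength(
    cooc, "screen", [("small", "big"), ("small", "large")]
)
print(strengths)
```

Normalizing within a domain is what lets a pair that is rarer overall, such as small: big, come out stronger than the canonical pair in a specific domain.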


strong    wind rain         wind rain         86
heavy     winds snowfall    winds snowfall    3
          winds rainfall    winds rainfall    34
          waves rainfall    waves rainfall    4

Table 2. The matrix of the frequencies of terms co-occurring with sample antonyms and the associated potential domain concepts

2.4 Extracting co-occurrences frequency specific to a given Domain/Context

The algorithm calculated the co-occurrence frequency of the antonyms/synonyms with the different concepts they refer to (or modify) as presented in table 3 by combining the information obtained in section 2.3 and section 2.4.

Antonyms    Concept 1    Concept 2    Concept    Frequency    Domain
hot         summer       winter                  10           temperature
cold                                             5
strong      wind         rain                    11           winds
heavy       winds        snowfall                2            rain
            winds        rainfall
            waves        rainfall

Table 3. The frequency of sample antonyms specific to the underlying domains

2.5 Variant Domain Dependent Co-occurrence Extraction

In the previous algorithm, the co-occurrence information was extracted from the same sentence. However, unlike the antonyms, synonyms rarely occurred together in the same context (the same sentence and domain). It is natural to assume that in most cases synonyms are used in different contexts since they evoke similar but not identical meanings. This is however not the case for antonyms, which were always used to evoke properties of the same meanings when these antonymic words were used to express opposition (Paradis & Willners 2011), and in fact also when they are not used to express opposition (Paradis et al., 2015). Because of this we decided to develop a variant domain dependent co-occurrence algorithm for the synonyms and antonyms, which instead extracts patterns of co-occurrence information of the synonyms and antonyms in different sentences, because we expected synonyms to be applicable to different, rather than the same contexts, since complete overlap of meanings of words is rare or even non-existent. This way we were able to gain information indirectly about their use by extracting their co-occurrence when they appear separately in different sentences while still being instantiated in the same domain. We mined the co-occurrence information of the synonym/antonym pairs separately in all possible domains and checked whether they co-occurred in the same sorts of domains:
• X(y, f)
• Z(y, f)
where X and Z are a pair of a given antonym/synonym, Y is the domain within which the pairs of the antonym/synonym co-occur and f the frequency of the x-y or z-y co-occurrence.

The frequency of a pair of the antonyms/synonyms in the Y domain was counted and the same applies to the other pair. This made it possible to measure the degree of co-occurrence of the antonym/synonym pairs from the domain perspective indirectly.

3. Results and discussion

3.1 Co-occurrences in the same sentence

Based on the results of the experiment the strength of the antonyms/synonyms varies in relation to the domains of instantiation. Hence, the strength of the co-occurrence of antonyms and synonyms is a function of the domains. For instance, the antonyms slow: fast, slow: quick and slow: rapid were used in completely different domains with little or no overlap. Slow: fast is used in the domains of motion, movement, speed; slow: quick is used for time, march, steps domains. The synonyms powerful: strong are used in the domains of voices, links, meaning; strong: muscular in the domains of legs, neck; strong: heavy are used in the domains of wind rain, waves rainfall, winds snow respectively; intense: strong in the domains of battle resistance, radiation gravity, updrafts clouds respectively.

We observed some unique patterns among the antonyms and synonyms as described below:

The antonyms:
• Co-occurred frequently in the same domain in the same sentence.

2 Comparative variation in perception

2.1 Method

31 native speakers of Canadian English participated in an auditory decision task in which they had to decide whether the acoustic stimulus was an existing English form. The set of stimuli contained the analytic and synthetic comparative form for 60 adjective types with at least 5 attestations for both forms in the Corpus of Contemporary American English (Davies 2008-). The stimuli were produced by a male speaker of Canadian English with phonetic training. He was instructed to produce the stimuli in citation form with a single accent on the primary stressed syllable of the base adjective in both types of stimuli. Accordingly, more was produced stressed, but unaccented.

Alongside the 2 × 60 = 120 synthetic and analytic comparatives, the set of stimuli also included 360 distractors. Some of the distractors combined more with non-existing words, others combined the adjective bases with the illegal suffix -ic. In addition, the set of distractor items contained non-existing words ending in -er as well as existing words and complex words. Examples of the test stimuli are given in (1a), and distractor examples are given in (1b).

(1) a. colder, happier, yellower
       more cold, more wealthy, more yellow
    b. ∗coldic, ∗more gorsty, ∗rilker
       on wire, chasting

2.2 Results

Figure 1 displays the density estimate for the distribution of reaction times. The solid and the dashed lines correspond to the results for synthetic and analytic stimuli, respectively.

[Figure 1: Density estimate of reaction times in perception experiment. X-axis: reaction times in ms (1000 to 2500); Y-axis: density; separate curves for Synthetic and Analytic stimuli.]

[Figure 2: Partial effects of significant interactions of Class on reaction times. Panels: Preceding RT, PLD20 (residualized), No. of phonemes (residualized), Synthetic frequency, Analytic frequency; vertical axis: transformed reaction time; separate lines for Analytic and Synthetic stimuli.]

The density estimate suggests that reaction times are, in general, higher for analytic comparatives than for synthetic comparatives. This visual interpretation is supported by a linear mixed-effects regression model with reaction times as the dependent variable (in order to fulfill the linearity assumption of the linear model, the reaction times were power-transformed with λ = -1.52, see Box and Cox 1964). The main predictor was the factor Class (with values Synthetic and Analytic). Additional predictors addressed several influences that may be expected to affect the reaction times: the subject-specific variables Handedness, Sex, and Age, the experimental variables Trial number and Reaction time in previous trial (Preceding RT, see Baayen and Milin 2010 for a discussion), phonological variables (Metrical structure of base, residualized Number of phonemes), and the lexical variables Number of phonological neighbours, Mean RT of base adjective, residualized Phonological Levenshtein distance (PLD20, all three from Balota et al. 2007), Age of acquisition (from Kuperman et al. 2012), Frequencies of base, Analytic comparative, Synthetic comparative (from COCA), Inflectional entropy (cf. Moscoso del Prado Martín et al. 2004). With the exception of the three Subject predictors, the initial model contained interactions between Class and the other predictors. Finally, random intercepts were included for the factors Subject and Adjective base.

After removal of insignificant predictors, the final model reports significant interactions between

152 33 stimulus Class and Preceding RT, PLD20, Number 3.2 Reaction times instead of using the linear ordering of the words  Start with the selected set of syno- of phonemes, Synthetic frequency, and Analytic In order to be able to investigate the effect of the in the text, it generates co-occurrences frequen- nym/antonym pairs frequency. Figure2 displays the partial effects processing complexity of the base adjective on the cies along paths in the dependency tree of the  Extract sentences containing the pairs for these interactions. The vertical axis shows the preferred comparative variant, the same 41 speak- sentence as presented in the sections 2.2–2.5.  Identify the dependency information of transformed reaction times; higher values corre- ers first participated in a visual lexical decision the sentences spond to longer reaction times. 2.1 Training and testing data task that gathered reaction times for the 60 target  Mine the dependency patterns linking In agreement with figure 1, the partial effects adjectives, as well as 150 other existing and non- The antonyms and synonyms employed for train- the pairs with the concepts they modify reveal significantly lower estimates for the syn- existing distractor items. The participants were ing and testing were extracted from the data used  Use these learned patters to extract fur- thetic stimuli (solid lines) than for the analytic not informed about the purpose of this task, and by Paradis et al. (2009) where the antonyms are ther relations (synonym/antonym pairs stimuli (dashed lines). This is true even in the presented according to their underlying dimen- there were at least 14 days for each participant be- and the associated concepts) most adverse conditions (e.g. in cases in which sions and synonyms were provided for all the tween the lexical decision task and the production the synthetic comparative of a comparative is at- individual antonyms (for a description of the 2.3 Extracting the domains experiment. 
The reaction times obtained in this tested only very rarely in a linguistic corpus, left principles see Paradis et al. 2009). That set of task were pooled for each adjective, and the me- We created a matrix of antonym and synonym edge of lower right panel in figure2). antonyms and synonyms were used to extract pairs matching every antonym and synonym dian was calculated. their co-occurrence patterns from the Wikipedia from the list in Table 1. Using the patterns 3 Comparative variation in production texts in this study. learned in section 2.2 we identified as many do- 3.3 Results mains as possible for the pairs of synonyms and 3.1 Method For most of the adjectives, the completion task Dimen- Anto- The associated syn- antonyms and calculated their frequency of co- 41 native speakers of Canadian English partici- was successful in obtaining comparative responses sions nyms onyms of the antonyms occurrence in the respective domains. pated individually in a spoken sentence comple- from the 41 speakers. However, two participants Size Large huge, vast, massive ,big When the lexical concepts were considered tion task. The task used the same set of 60 ad- produced hardly any comparative in the task, and ,bulky, giant ,gross, too specific, we referred them to more inclusive, jectives as in the perception experiment above, but were therefore excluded from the data set. 6 out heavy, significant ,wide superordinate domains. Frequency of occurrence none of the participants in the production exper- of the 60 adjectives were excluded because the Small little, low, minor, minute, was used as a criterion for conflation of concepts iment had also participated in the previous task. responses contained almost exclusively synthetic petite, slim, tiny into superordinate ones as follows. 
 Extract term co-occurrence frequencies Participants were first shown a context sentence or analytic comparatives, or because the context Speed Fast quick, hurried, prompt, within a window of sentences constitut- containing the adjective in the positive. After a key sentence did not elicit a considerable number of accelerating, rapid press, an incomplete target sentence containing a Slow sudden, dull, gradual, lazy ing both the antonyms/synonyms and the comparative responses. 747 out of the remain- potential domain concepts. For instance: blank and one or more target words appeared also ing 39 54 = 2106 responses contained a syn- Strength Strong forceful, hard, heavy, o Antonyms: cold: hot, domain on the screen. The participants were instructed to × thetic comparative (35 %), 843 contained an ana- muscular, powerful, sub- concepts: winter, summer use the target words to fill the blank in the sen- lytic comparative (40 %). The remaining 516 re- stantial, tough o Synonyms: strong: heavy, do- tence. If necessary, they could also use additional sponses (25 %) did not contain a comparative con- Weak light, soft, thin, wimpy main concepts: wind, rain words to complete the sentence. The sentences struction, and were discarded. There was notable Merit Bad crappy, defective, evil  Create a matrix of the potential domain were constructed in such a way that a comparative variation between the two variants both across and ,harmful, poor ,shitty concepts and the co-occurring terms with construction was the most likely target for comple- within items, which indicates that English compar- ,spoiled ,unhappy their frequencies tion, but participants were not explicitly instructed ative variation is indeed a highly non-deterministic Good awful ,genuine ,great, ho-  Cluster them using the k-means algo- to use comparatives. The structure of the incom- field that is apparently affected by both speaker- norable ,hot, neat, nice, rithm plete sentences was the same in all trials. 
The dependent and adjective-dependent factors. reputable, right ,safe ,well  Take the term with the maximal frequen- subject was a simple noun phrase, followed by a Logistic general additive mixed-effects models cy (centroid) in each cluster and consider copula verb. The blank to be filled followed in (cf. Wood 2006) were used to investigate the re- Table 1. The antonym pairs in their meaning dimen- it the domain term predicative position. This design ensured that the lation between the median RTs and the individual sions and the associated synonyms.  Test the result using expert judgment context-dependent factors reported in the literature running the algorithm on the test set. responses. These models have the advantage of re- 2.2 Extracting the co-occurrences of the such as the increased probability of analytic com- vealing statistically significant effects of the inde- antonyms and synonyms in the respec- paratives in predicative position were held con- pendent variable on the dependent even if the rela- Words co-

tive domains o-

stant for all adjectives. Example (3) shows the ex- D

tion between them is not a linear one. For instance, occurring onym

In order to extract the co-occurrences of the an- n

perimental trial for the target adjective wealthy. there could a threshold in the reaction times up to with possible

cept

tonyms/synonyms in the respective domains we Sy / which speakers strongly prefer the synthetic com- n domain con- (2) The duke is wealthy. produced the relational information among the parative, but beyond which they shift to analytic cepts Yet, the king is . constituent words of a given sentence. To this co n

comparatives in a nearly categorical way. In such end, we extracted the patterns linking the syn- requency

F Potential mai WEALTHY a case, a linear model might fail to detect this non- onyms/antonyms and the concepts they modify Antonym The experiment also contained 105 distractor linear effect of RTs on the responses. and used this same pattern to extract more lexical hot summer win- temperature 50 trials that had a similar structure, but which did Two models were fitted: a null model which concepts. The procedure was as follows. cold ter climate 43 not contain adjectives as the target words. contained only a random effect for speaker, and Wind 30

34 151 a model with an additional smooth term for the 4 Discussion and conclusion On the use of antonyms and synonyms from a domain perspective effect of the median RTs. If processing complex- ity has a notable effect on speaker responses, the The results from the first experiment show that Debela Tesfaye Carita Paradis smooth term should turn out to be statistically sig- synthetic comparatives have a clear perceptual IT PhD Program Centre for Languages and Literature nificant, and the predictive accuracy of the model processing advantage over the analytic correspon- Addis Ababa University Lund University should improve by the addition of the term. As dents. Even in conditions in which the morpho- Addis Ababa, Ethiopia Lund, Sweden table 1 shows, this is indeed the case. While the logical form is particularly difficult to process, the null model has a total predictive accuracy of about average reaction time is still faster than that for the phrasal variants. This finding makes it rather dabo [email protected] [email protected] lu.se 69 %, the addition of the smooth term for median RTs increases the accuracy by 5.6 %. There is a unlikely that the use of analytic comparatives in larger increase of predictive accuracy for analytic cognitively demanding environments benefits the The rationale is that the dependency parsing pro- responses than for synthetic responses (7.1 % vs. listener. Yet, the findings from the production ex- Abstract duces the relational information among the con- 3.9 %). periment reveal a significant relation between the stituent words of a given sentence, which allows This corpus study addresses the question selected comparative form and the processing dif- us to (i) extract co-occurrences specific to a giv- of the nature and the structure of anto- Synthetic Analytic Total ficulty of the adjective in question. 
For cognitively en domain/context, and (ii) capture long distance nymy and synonymy in language use, Null model 515 580 1095 more complex adjectives which take longer to pro- co-occurrences between the word pairs. Consider following automatic methods to identify (68.9%) (68.8%) (68.9%) cess, the analytic comparative is preferred, sug- (1). their behavioral patterns in texts. We ex- Model with 544 640 1184 gesting that speakers resort to the phrasal alterna- 1. Winters are cold and dry, summers are amine the conceptual closeness/distance tive if processing demands are relatively high. cool in the hills and quite hot in the plains. smooth term (72.8%) (75.9%) (74.5%) of synonyms and antonyms through the In (1), the antonyms cold: hot modify winters One aspect to keep in mind is that lexical de- lens of their DOMAIN instantiations. and summers respectively. Those forms express Table 1: Correctly predicted responses in the sen- cision tasks like those used above to collect reac- tence completion task. tion times have a strong focus on form process- 1 Introduction the lexical concepts winter and summer in the domain temperature. The antonyms cold: hot co- ing, while they are less informative about func- Using data from Wikipedia, this corpus study occur but at a distance in the sentence. Thanks to Figure3 illustrates the contribution of the tional processing (see Yap et al. 2011 for a dis- addresses the question of the nature and the the dependency information, it is possible to ex- smooth term to the model. The vertical position cussion). Even if the perception experiment has structure of antonym and synonymy in language tract such long distance co-occurrences together of the regression line indicates the predicted prob- shown that the analytic form is more difficult to use. While quite a lot of empirical research using with the concepts modified. 
ability of analytic responses for the median RTs process for listeners, the higher explicitness of the different observational techniques has been car- The article is organized as follows. In section shown on the horizontal axis. The shaded area more comparative may still make the comparative ried on antonymy (e.g. Roehm et al. 2007, Loba- 2, we describe the procedure and the two me- indicates the 95 % confidence band. As the fig- function more accessible for listeners than the - nova 2013, Paradis et al. 2009, Jones et al. 2012), thods used: co-occurrence extraction of lexical ure shows, the relation between processing com- er comparative, which is also suggested by Mon- not as much has been devoted to synonymy (e.g. items in the same sentence and a variant domain plexity and comparative preference is indeed non- dorf (2009, 6). The experiments reported here do Divjak 2010) and very little has been carried out dependent co-occurrence extraction method. The linear: speakers strongly prefer the synthetic com- not address this issue of the comparative alterna- on both of them using the same methodologies latter method extracts patterns of co-occurrence tion, but looking at functional accessibility offers (Gries & Otani 2010). The goal of this study is to information of the synonyms and antonyms in parative for adjectives with very low RTs, but tend bring antonyms and synonyms together, using different sentences. In section 3 we present the to favor the analytic comparative for adjectives a promising venue of future research. the same automatic methods to identify their be- results and discussions followed by a discussion with RTs larger than 600 ms. In sum, the produc- To conclude, the results imply that speakers and havioral patterns in texts. 
We examine the con- of our results in comparison with related pre- tion experiment shows that the processing com- listeners process analytic and synthetic compar- ceptual closeness/distance of synonyms and an- vious works in section 4. The conclusions are plexity of the base adjective has an effect on the atives differently: while the morphological form tonyms through the lens of their domain instan- presented in section 5. preference of analytic comparatives by speakers. is easier to process for listeners, the phrasal form tiations. For instance, strong used in the context has benefits for the speaker. More generally, these of wind or taste (of tea) as compared to light and 2 Procedure 1.0 findings also contribute toward our understanding weak respectively, and light as compared to 0.8 Using an algorithm similar to the one proposed of morphosyntactic exponence. It is frequently ar- heavy when talking about rain or weight. by Tesfaye & Zock (2012) and Zock & Tesfaye 0.6 gued (e.g. in McWhorter 2001) that analytic forms The basic assumption underlying this study is (2012), we extracted the co-occurrence informa- 0.4 are less complex than synthetic forms, with conse-

that the strength of co-occurrence of antonyms P (analytic) tion of the pairs in different domains separately, 0.2 quences for fields such as the structure of contact and synonyms is dependent on the domain in measuring the strength of their relation in the languages or the diachronic development of a lan- which they are instantiated and co-occur. In or- 0.0 different domains with the aim of (i) making guage. This paper is one of the few that explicitly der to test the hypothesis we mine the co- principled comparisons between antonyms and 500 600 700 800 900 occurrence information of the antonyms and the address the processing efficiency of grammatical synonyms from a domain perspective, and (ii) Median reaction time (ms) synonyms relative to the domains using a depen- variants where one form is morphological and the determining the structure of antonymy and syn- dency grammar method. 1 other syntactic in nature. The findings suggest that onymy as categories in language and cognition. Figure 3: Effect of median reaction time on the the discussion of the alledged complexity of syn- Our algorithm is similar to the standard n- probability of analytic responses. 1 thetic forms may also need to take into account http://nlp.stanford.edu/software/lexparser.shtml gram co-occurrences extraction algorithms, but different demands of speakers and listeners.
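
As a quick consistency check on the production results, the percentages in Table 1 and the reported gains in predictive accuracy can be recomputed from the raw counts. The following Python sketch is illustrative only: the variable and function names are not taken from the paper, and it assumes that the percentages are computed over the 747 synthetic and 843 analytic responses that entered the models.

```python
# Counts as reported in section 3.3 and Table 1; all names are illustrative.
synthetic, analytic, discarded = 747, 843, 516
total_responses = 39 * 54  # remaining speakers x remaining adjectives

# Correctly predicted responses per model (Table 1).
correct = {
    "null": {"synthetic": 515, "analytic": 580},
    "smooth": {"synthetic": 544, "analytic": 640},
}


def pct(part, whole):
    """Percentage rounded to one decimal, as printed in Table 1."""
    return round(100 * part / whole, 1)


def accuracy(model):
    """Per-variant and total predictive accuracy of one model."""
    c = correct[model]
    return {
        "synthetic": pct(c["synthetic"], synthetic),
        "analytic": pct(c["analytic"], analytic),
        "total": pct(c["synthetic"] + c["analytic"], synthetic + analytic),
    }


null_acc = accuracy("null")
smooth_acc = accuracy("smooth")
# Gain in percentage points when the smooth term for median RT is added.
gain = {k: round(smooth_acc[k] - null_acc[k], 1) for k in null_acc}

print(null_acc)
print(smooth_acc)
print(gain)
```

Under these assumptions the recomputed values agree with the figures given in the text: a gain of 5.6 percentage points overall, 7.1 for analytic and 3.9 for synthetic responses.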

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

Acknowledgments

This work was supported by the Deutsche Forschungsgemeinschaft (grant KU 2896/1-1). I wish to thank Ben Tucker (University of Alberta, Edmonton) for making available to me the facilities of the Alberta Phonetics Laboratory for the experiments reported in this paper.

References

R. Harald Baayen and Petar Milin. 2010. Analyzing reaction times. International Journal of Psychological Research, 3(2):12–28.

David A. Balota, Melvin J. Yap, Michael J. Cortese, Keith A. Hutchison, Brett Kessler, Bjorn Loftis, James H. Neely, Douglas L. Nelson, Greg B. Simpson, and Rebecca Treiman. 2007. The English Lexicon Project. Behavior Research Methods, 39(3):445–459.

Jeremy Boyd. 2007. Comparatively speaking. A psycholinguistic study of optionality in grammar. Ph.D. thesis, University of California, San Diego.

George E. P. Box and David R. Cox. 1964. An analysis of transformations. Journal of the Royal Statistical Society. Series B, 26(2):211–252.

Mark Davies. 2008–. The Corpus of Contemporary American English (COCA): 450 million words, 1990-present. Available online at http://corpus.byu.edu/coca/.

Martin Hilpert. 2008. The English comparative. Language structure and language use. English Language and Linguistics, 12(3):395–417.

Victor Kuperman, Hans Stadthagen-Gonzalez, and Marc Brysbaert. 2012. Age-of-acquisition ratings for 30,000 English words. Behavior Research Methods, 44(4):978–990.

John H. McWhorter. 2001. The world's simplest grammars are creole grammars. Linguistic Typology, 5:125–166.

Britta Mondorf. 2009. More support for more-support. John Benjamins, Amsterdam.

Britta Mondorf. 2014. Apparently competing motivations in morpho-syntactic variation. In Brian MacWhinney, Andrej Malchukov, and Edith Moravcsik, editors, Competing motivations in grammar and usage, pages 209–228. Oxford University Press, Oxford.

Fermín Moscoso del Prado Martín, Aleksandar Kostić, and R. Harald Baayen. 2004. Putting the bits together. An information theoretical perspective on morphological processing. Cognition, 94(1):1–18.

Benedikt Szmrecsanyi. 2005. Language users as creatures of habit: A corpus-based analysis of persistence in spoken English. Corpus Linguistics and Linguistic Theory, 1(1):113–150.

Simon N. Wood. 2006. Generalized Additive Models. An introduction with R. Chapman & Hall/CRC, Boca Raton, FL.

Melvin J. Yap, Sarah E. Tan, Penny M. Pexman, and Ian S. Hargreaves. 2011. Is more always better? Effects of semantic richness on lexical decision, speeded pronunciation, and semantic classification. Psychonomic Bulletin & Review, 18(4):742–750.


Appendix 1: A SYMPAThy-based view of the network of Cxns with the verb gettare

TL = GETTARE 'THROW'

Frame2 (subj#obj) Cxn
  form:    [[SUBJ]NP gettare [OBJ]NP]
  SUBJ:    Person, Animal, ...
  OBJ:     Substance, Artifact, ...
  meaning: [CAUSE (OBJ, [GO (AWAY)])]

Frame1 (subj#obj#comp-su) Cxn
  form:    [[SUBJ]NP gettare [OBJ]NP su [COMP]NP]
  SUBJ:    Person, Event, ...
  OBJ:     Substance, Natural_Phenomenon, ...
  COMP:    Artifact, Substance, ...
  meaning: [CAUSE (OBJ, [GO (OBJ, [TO ([ON (COMP)])])])]

Frame3 (subj#obj#comp-in) Cxn
  form:    [[SUBJ]NP gettare [OBJ]NP in [COMP]NP]
  SUBJ:    Event, Act, ...
  OBJ:     Natural_Object, Substance, ...
  COMP:    Feeling, State, ...
  meaning: [CAUSE (OBJ, [GO (OBJ, [TO ([IN (COMP)])])])]

Instantiation links:

gettare#fango##comp-su Cxn
  form:    [[SUBJ]NP gettare (ADV) (ADJ) fango su [COMP]NP]
  SUBJ:    Person, Event, ...
  OBJ:     fango (⇒ SG; bare | partitive)
  COMP:    Person, Institution, ...
  meaning: 'defame, discredit, blacken the name of'

gettare#ombra#comp-su Cxn
  form:    [[SUBJ]NP gettare (ADV) [ombra]NP su [COMP]NP]
  SUBJ:    Person, Event, ...
  OBJ:     ombra (⇒ full NP)
  COMP:    Person, Institution, ...
  meaning: 'cast a shadow'

gettare#acqua#sul#fuoco Cxn
  form:    [[SUBJ]NP gettare (ADV) (ADJ) acqua sul fuoco]
  SUBJ:    Person, Event, ...
  OBJ:     acqua
  COMP:    fuoco
  SU:      sul
  meaning: 'defuse, minimize a situation'

gettare#benzina#sul#fuoco Cxn
  form:    [[SUBJ]NP gettare (ADV) (ADJ) benzina sul fuoco]
  SUBJ:    Person, Event, ...
  OBJ:     benzina
  COMP:    fuoco
  SU:      sul
  meaning: 'add fuel to the fire'

Examples:

... Questo getta una pesantissima ombra sulla legittimità ... 'This casts a serious shadow on the legitimacy...'
... Il rivale getta ombra sulla salute del leader ... 'His opponent casts a shadow on the leader's health'
... rischia di gettare ulteriore fango sul calcio ... '(it) may sully football even more'
... Hanno sempre gettato fango su di noi ... 'They have always sullied us'
... la società getta acqua sul fuoco ... 'the company defuses (the situation)'
... getta abbondante acqua sul fuoco ... '(it) minimizes (the situation) greatly'
... lei sta gettando benzina sul fuoco ... 'she is adding fuel to the fire'
... Evitiamo di gettare altra benzina sul fuoco ... 'Let's not add fuel to the fire'
... Gli amici hanno gettato sulla bara garofani rossi ... 'Friends threw red carnations on his coffin'
... getta un sasso sull' autostrada ... '(s/he) throws a stone in the highway'

The verb gettare 'to throw' combines with the highly schematic subj#obj#comp-su Cxn, whose slots can freely vary with respect to linear order, presence of determiners, modifiers, etc. A semi-productive instance of this construction is the subj#obj:ombra#comp-su Cxn, with a fixed object slot and a partially variable oblique slot, which can appear with a semantically limited range of arguments. A fully lexically specified instance of the same construction is instead the subj#obj:acqua#comp-su:sul-fuoco Cxn, which has both slots instantiated and a limited degree of variability.

Appendix 2: List of idioms used as experimental stimuli

Gettare la maschera ('to reveal oneself')
Gettare la spugna ('to give up')
Gettare acqua sul fuoco ('to defuse a situation')
Gettare olio sul fuoco ('to inflame a situation')
Mettere la mano sul fuoco ('to stake one's life on sth')
Mettere il carro davanti ai buoi ('to put the cart before the horse')
Mettere le carte in tavola ('to lay one's cards on the table')
Mettersi il cuore in pace ('to resign oneself to sth')
Mettere nero su bianco ('to put sth down in black and white')
Mettere il dito sulla piaga ('to hit someone where it hurts')
Mettere i puntini sulle i ('to be nitpicking')
Mettere zizzania ('to sow discord')
Perdere la testa ('to lose one's head')
Perdere il treno ('to miss an opportunity')
Perdere il filo ('to lose the thread')
Perdere la bussola ('to lose one's bearings')
Prendere il toro per le corna ('to take the bull by the horns')
Prendere una cotta ('to get a crush on somebody')
Prendere un granchio ('to make a blunder')
Tirare i remi in barca ('to rest on one's oars')
Tirare la cinghia ('to tighten one's belt')
Tirare le cuoia ('to die')
Tirare la corda ('to take sth too far')
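
The construction network of Appendix 1 can be pictured as a small data structure in which each construction carries its slots, the slots that are lexically fixed, and links to its more specific instances. The Python sketch below is only an illustration under stated assumptions: the class `Cxn`, its fields, and the three-way labelling in `kind` are hypothetical and are not part of SYMPAThy; the labelling simply counts how many non-subject slots are lexically fixed.

```python
from dataclasses import dataclass, field


@dataclass
class Cxn:
    """One construction: named slots, the lexically fixed slots, a meaning."""
    name: str
    slots: dict                      # slot name -> tuple of admissible fillers
    fixed: frozenset = frozenset()   # slots filled by a specific lexical item
    meaning: str = ""
    instances: list = field(default_factory=list)  # instantiation links

    def instantiate(self, name, overrides, fixed, meaning):
        """Link a more specific construction that overrides some slots."""
        child = Cxn(name, {**self.slots, **overrides}, frozenset(fixed), meaning)
        self.instances.append(child)
        return child

    def kind(self):
        """Toy labelling over the non-subject slots (an assumption of this sketch)."""
        open_slots = [s for s in self.slots if s != "SUBJ"]
        n_fixed = sum(s in self.fixed for s in open_slots)
        if n_fixed == 0:
            return "schematic"
        return "fully specified" if n_fixed == len(open_slots) else "semi-productive"


# Frame1 and two of its instances, with fillers copied from Appendix 1.
frame1 = Cxn(
    "subj#obj#comp-su",
    {"SUBJ": ("Person", "Event"),
     "OBJ": ("Substance", "Natural_Phenomenon"),
     "COMP": ("Artifact", "Substance")},
    meaning="[CAUSE (OBJ, [GO (OBJ, [TO ([ON (COMP)])])])]",
)
ombra = frame1.instantiate(
    "gettare#ombra#comp-su",
    {"OBJ": ("ombra",), "COMP": ("Person", "Institution")},
    fixed={"OBJ"}, meaning="cast a shadow")
acqua = frame1.instantiate(
    "gettare#acqua#sul#fuoco",
    {"OBJ": ("acqua",), "COMP": ("fuoco",)},
    fixed={"OBJ", "COMP"}, meaning="defuse, minimize a situation")

for cxn in (frame1, ombra, acqua):
    print(cxn.name, "->", cxn.kind())
```

Keeping the fixed slots separate from the slot fillers makes the three cases described in the appendix (schematic, semi-productive, fully lexically specified) fall out of a single count.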

Acknowledgments

This research was carried out within the CombiNet project (PRIN 2010-2011 Word Combinations in Italian: theoretical and descriptive analysis, computational models, lexicographic layout and creation of a dictionary, n. 20105B3HE8) funded by the Italian Ministry of Education, University and Research (MIUR).

References

[Attardi and Dell'Orletta 2009] Giuseppe Attardi and Felice Dell'Orletta. 2009. Reverse revision and linear tree combination for dependency parsing. In Proceedings of NAACL 2009, pages 261–264.

[Baroni et al. 2004] Marco Baroni, Silvia Bernardini, Federica Comastri, Lorenzo Piccioni, Alessandra Volpi, Guy Aston, and Marco Mazzoleni. 2004. Introducing the La Repubblica Corpus: A Large, Annotated, TEI(XML)-Compliant Corpus of Newspaper Italian. In Proceedings of LREC 2004, pages 1771–1774.

[Calzolari et al. 2002] Nicoletta Calzolari, Charles J. Fillmore, Ralph Grishman, Nancy Ide, Alessandro Lenci, Catherine MacLeod, and Antonio Zampolli. 2002. Towards best practice for multiword expressions in computational lexicons. In Proceedings of LREC 2002, pages 1934–1940.

[Dell'Orletta 2009] Felice Dell'Orletta. 2009. Ensemble system for Part-of-Speech tagging. In Proceedings of EVALITA 2009.

[Evert and Krenn 2005] Stefan Evert and Brigitte Krenn. 2005. Using small random samples for the manual evaluation of statistical association measures. Computer Speech & Language, 19(4):450–466. Special issue on Multiword Expression.

[Fillmore et al. 1988] Charles J. Fillmore, Paul Kay, and Mary Catherine O'Connor. 1988. Regularity and idiomaticity in grammatical constructions: the case of let alone. Language, 64(3):501–538.

[Goldberg 1995] Adele Goldberg. 1995. Constructions. A Construction Grammar Approach to Argument Structures. The University of Chicago Press, Chicago.

[Goldberg 2006] Adele Goldberg. 2006. Constructions at work. Oxford University Press, Oxford.

[Gries 2008] Stefan Th. Gries. 2008. Phraseology and linguistic theory: a brief survey. In Sylviane Granger and Fanny Meunier, editors, Phraseology: an interdisciplinary perspective, pages 3–25. John Benjamins, Amsterdam & Philadelphia.

[Hoffmann and Trousdale 2013] Thomas Hoffmann and Graeme Trousdale, editors. 2013. The Oxford Handbook of Construction Grammar. Oxford University Press, Oxford.

[Lenci 2014] Alessandro Lenci. 2014. Carving verb classes from corpora. In Raffaele Simone and Francesca Masini, editors, Word Classes. Nature, typology and representations, Current Issues in Linguistic Theory, pages 17–36. John Benjamins.

[Nissim and Zaninello 2011] Malvina Nissim and Andrea Zaninello. 2011. A quantitative study on the morphology of Italian multiword expressions. Lingue e Linguaggio, X:283–300.

[Sag et al. 2002] Ivan A. Sag, Timothy Baldwin, Francis Bond, Ann Copestake, and D. Flickinger. 2002. Multiword expressions: A pain in the neck for NLP. In Proceedings of CICLing 2002, pages 1–15.

[Shannon 1948] Claude E. Shannon. 1948. A mathematical theory of communication. The Bell System Technical Journal, 27(3):379–423.

[Squillante 2014] Luigi Squillante. 2014. Towards an empirical subcategorization of multiword expressions. In Proceedings of the 10th Workshop on Multiword Expressions (MWE), pages 77–81, Gothenburg, Sweden, April. Association for Computational Linguistics.

[Tabossi et al. 2011] Patrizia Tabossi, Lisa Arduino, and Rachele Fanari. 2011. Descriptive norms for 245 Italian idiomatic expressions. Behavior Research Methods, 43:110–123.

[Wulff 2008] Stefanie Wulff. 2008. Rethinking Idiomaticity: A Usage-based Approach. Continuum.

[Wulff 2009] Stefanie Wulff. 2009. Converging evidence from corpus and experimental data to capture idiomaticity. Corpus Linguistics and Linguistic Theory, 5(1):131–159.

[Zeldes 2013] Amir Zeldes. 2013. Productive argument selection: Is lexical semantics enough? Corpus Linguistics and Linguistic Theory, 9(2):263–291.


Lexical emergentism and the "frequency-by-regularity" interaction

Claudia Marzi   Marcello Ferro   Vito Pirrelli
Institute for Computational Linguistics - National Research Council - Pisa
{claudia.marzi,marcello.ferro,vito.pirrelli}@ilc.cnr.it

Abstract

In spite of considerable converging evidence of the role of inflectional paradigms in word acquisition and processing, little effort has been put so far into providing detailed, algorithmic models of the interaction between lexical token frequency, paradigm frequency and paradigm regularity. We propose a neuro-computational account of this interaction, and discuss some theoretical implications of preliminary experimental results.

1 Introduction

Over the last fifteen years, growing evidence has accrued of the role of morphological paradigms in the developmental course of word acquisition. Children have been shown to be sensitive to sub-regularities holding among paradigm cells (see, among others, Orsolini et al., 1998; Laudanna et al., 2004 on Italian; Dabrowska, 2004, 2005 on Polish; and Labelle and Morris, 2011 on French). In line with this evidence, and contrary to both rule-based (e.g. Pinker and Ullman, 2002; Albright, 2002) and connectionist approaches to word acquisition (Rumelhart and McClelland, 1986), no unique paradigm cell can be identified as the base source of all inflected forms produced by the speaker, but the structure of the entire paradigm is understood to play a fundamental role in both word acquisition and processing.

Such evidence supports a view of the mental lexicon as an emergent integrative system, whereby words are concurrently, redundantly and competitively stored (Alegre and Gordon, 1999; Baayen et al., 2007). The view assumes that all word forms are memorised in the lexicon, thus making no distinction between regular and irregular inflected forms, or between uniquely stored bases and all other non-base forms produced by the speaker on demand (see Baayen, 2007; Marzi, 2014; for a recent overview). In addition, to capture the fact that words encountered frequently exhibit different lexical properties from words encountered relatively infrequently, any model of lexical access must assume that accessing a word in some way affects the access representation of that word (e.g. Foster, 1976; Marslen-Wilson, 1993; Sandra, 1994).

In spite of such a wealth of converging evidence, however, little effort has been put so far into providing detailed, algorithmic models of the interaction between word frequency, paradigm frequency, paradigm regularity and lexical familiarity in word acquisition and processing. We offer here such an algorithmic account, and discuss some theoretical implications on the basis of computational simulations.

2 The computational model

In the present contribution, we use Temporal Self-organising Maps (TSOMs) to simulate dynamic effects of lexical storage, organisation and competition.

Figure 1. An integrated activation pattern for the input string "#pop$". Note that two distinct, but topologically neighbouring nodes respond to the two p's in pop, bearing witness to the process of selective sensitivity to time-bound instances of the same symbol type. For simplicity, only the nodes that are most highly activated by each input symbol are shaded and tagged with that symbol.

TSOMs, a variant of classical Kohonen's SOMs (Kohonen, 2001), are dynamic memories that are trained to store and classify time-series of symbols through patterns of activation of fully interconnected nodes (Koutnik, 2007; Ferro et al., 2010; Pirrelli et al., 2011; Marzi et al., 2012). Map nodes mimic neural clusters, with inter-node connections representing neuron synapses whose weights determine the amount of influence that the activation of one node has on another node (Fig. 1). Each map node receives input

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage . Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

by Tabossi et al. (2011), and tested to what degree the speaker-elicited flexibility judgments available in this repository can be modeled by a composition of our variability indexes.

4.1 The descriptive norms by Tabossi et al.

Tabossi et al. (2011) collected several normative measures for 245 Italian verbal idiomatic expressions. Using a group of 740 Italian speakers, they collected a minimum of 40 elicited judgments for each idiom on several psycholinguistically relevant variables.

Among the different kinds of ratings, those concerning syntactic flexibility have been collected by inserting each idiomatic expression in a sentence in which one of the following five syntactic modifications occurred: adverb insertion, adjective insertion, left dislocation, passive and movement. Participants were asked to evaluate, on a 7-point scale, how much the meaning of the idiomatic expression in the syntactically modified sentence was similar to its unmarked meaning as expressed in a paraphrase prepared by the authors.

4.2 Data extraction

Out of the 245 expressions in Tabossi et al., we selected the 23 target idioms reported in Appendix 2. Each such idiom can be represented, in our approach, as a fully lexically specified transitive Cxn headed by a given verbal TL, for which the subject slot is underspecified (e.g. gettare#obj:maschera).

Combination    r
SUM            .44
AVERAGE        .44
AVERAGEPOS     .46
MAX            .47

Table 1: Pearson’s Correlation strength between different combination methods of the SYMPAThy-based fixedness indexes and the syntactic flexibility judgments in Tabossi et al. (2011). All reported values are associated with p < .05, N = 23.

flexibility ratings in Tabossi et al. (2011). Correlation values are reported in Table 1. In all cases, there is a significant (p < .05) positive correlation, ranging between .44 and .47, thus supporting the psycholinguistic plausibility of our corpus-based variability indexes.

These results, albeit preliminary, look promising especially given the different nature of the behavioral and corpus-based indexes. On the one hand, the speakers’ ratings are semantically driven, since they are thought to model how much the figurative meaning of a given idiom is sensitive to its syntactic form. On the other hand, the automatically corpus-derived information exploited by our indexes does not take meaning into account.

of paradigm members, we can investigate the relative contribution of input factors to the timing and pace of lexical acquisition and suggest an explanatory account of their interaction.

3 Experimental evidence

Fifty German and fifty Italian verb (sub)paradigms were selected among the most highly ranked paradigms by cumulative frequency in a reference corpus (CELEX Lexical database for German, Baayen et al., 1995; Paisà Corpus for Italian, Lyding et al., 2014). For each paradigm, an identical set of 15 cells was used for training, for an overall number of 750 inflected forms for each language. Each data set was administered to the map for 100 epochs under two different training regimes: a uniform distribution (UD: 5 tokens per word), and a function of real word frequency distributions in the reference corpus (SD: tokens are in the range of 1 to 1000). By varying frequency and comparing the inflectional complexity of training data across the two experiments, we expected to gain some insights into the interplay between morphological regularity (defined by levels of predictability in stem and ending allomorphy of training data in the two languages) and word frequency in word acquisition. After training, we monitored the behaviour of the four resulting TSOMs (namely UD Italian, SD Italian, UD German and SD German) by controlling the time of acquisition of individual words, the time of acquisition of entire paradigms, and their acquisitional time span.

connections from an input layer where individual symbols making up a word are presented one at a time, in their order of appearance. Input connections thus convey information of the current input stimulus to map nodes. Hebbian connections, on the other hand, are strengthened each time two nodes are activated at consecutive time ticks, conveying the probabilistic expectation that one node will be activated soon after another node is activated.

When a symbol is shown on the input layer at a certain time tick, all map nodes are fired synchronously, their overall pattern of activation representing the processing response of a TSOM to the symbol at that time tick. Due to principles of topological organisation of map’s responses, similar input stimuli (i.e. two instances of the same symbol in different contexts) tend to be associated with largely overlapping memory traces (e.g. the two p nodes activated by pop in Fig. 1). During training, nodes get gradually specialised to respond most strongly to specific time-bound instantiations of symbols, while remaining relatively inactive in the presence of other stimuli. A recurrent activation pattern associated with an input symbol occurring in a specific context can thus be seen as the map’s memory trace for that symbol in that context.

An input word is administered to a TSOM as a time series of symbols, i.e. a sequence of letters or sounds presented on the input layer one at a time. The map’s response to a word stimulus is the overall activation pattern obtained through
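The strengthening of Hebbian connections between nodes that fire at consecutive time ticks can be sketched as follows. This is a deliberately simplified illustration and not the TSOM training algorithm itself. The node labels, the learning rate `rate`, and the normalisation into a probability distribution are assumptions introduced here, only to show how repeated co-activation at consecutive ticks yields an expectation about which node fires next.

```python
from collections import defaultdict


def train_temporal_connections(winner_sequences, rate=0.1, epochs=1):
    """Strengthen the connection prev -> nxt each time node `nxt`
    fires one time tick after node `prev` (hypothetical update rule)."""
    weights = defaultdict(lambda: defaultdict(float))
    for _ in range(epochs):
        for seq in winner_sequences:
            for prev, nxt in zip(seq, seq[1:]):
                weights[prev][nxt] += rate
    return weights


def expectation(weights, node):
    """Normalise the outgoing weights of `node` into a distribution
    over the nodes expected to fire at the next time tick."""
    outgoing = weights.get(node, {})
    total = sum(outgoing.values())
    if total == 0:
        return {}
    return {nxt: w / total for nxt, w in outgoing.items()}


# Hypothetical most-active nodes for "#pop$" and "#cop$"; "p1" and "p2"
# stand for the two distinct nodes responding to the two p's in pop.
sequences = [["#", "p1", "o", "p2", "$"], ["#", "c", "o", "p2", "$"]]
weights = train_temporal_connections(sequences)
print(expectation(weights, "#"))
print(expectation(weights, "o"))
```

In this toy setting the start symbol is followed equally often by two different nodes, whereas the node for o is always followed by the same p node, so its outgoing expectation is fully concentrated on one successor.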
Such indexes describe a lexically specified Cxn that can in principle have an idiomatic as well as a compositional, literal meaning (even if, presumably, the latter case is rare in the corpus).

5 Conclusion

In this study we presented a procedure for characterizing the combinatorial potential of a lexical item and the degree of fixedness of the Cxns it occurs in. Such a procedure has been preliminarily tested on a small sample of idiomatic expressions and the resulting representation has been evaluated against the subject-elicited judgments collected by Tabossi et al. (2011). In the future, we are planning to extend the inventory of variability dimensions (addressing also the question of the semantic compositionality of Cxns), to study their relative weight and their interactions, and to develop more sophisticated ways to combine them.

We built the variational profiles of our target idioms by adopting an adapted version of the procedure described in Section 3:

1. for each TL, we extracted the SYMPAThy patterns from the “la Repubblica” corpus;
2. the patterns involving one of our target idioms were identified and selected;
3. for each idiom, the variability indexes described in Section 3.2 were calculated. Note that, given the nature of our experimental stimuli, the lexical variability index is not relevant;
4. we built a fixedness index for each idiom, according to the four composition methods in the previous section.

4.3 Results and discussion

In order to test the cognitive plausibility of the fixedness indexes extracted from SYMPAThy, we calculated the Pearson’s Product-Moment Correlation strength between them and the syntactic

integration of the activation patterns triggered by the individual symbols making up the word (see Fig. 1 for a simplified example with the word pop). Accordingly, if two input strings present some symbols in common (e.g. pop and cop, write and written), they will tend to activate largely overlapping patterns of strongly responsive nodes. Like in the case of individual symbols, the integrated activation pattern for an input word is, at the same time, the systematic processing response of the map to an input stimulus, and the word’s memorised representation (or memory trace) in the map.

To investigate issues of “frequency-by-regularity” interaction (Ellis and Schmidt, 1998), we compared two sets of parallel experiments carried out on German verb paradigms (Marzi et al., 2014) and Italian verb paradigms. By keeping constant some input conditions, such as selection of paradigm cells and degrees of morphological redundancy within training paradigms, while varying others, such as the frequency distribution

For our present purposes, we define the time of acquisition of a single word as the training epoch from which a TSOM can accurately recall the word in question from its memory trace. Recall is a difficult task that requires that the map has developed a clear notion of how to unfold a synchronous activation pattern (the word’s memory trace) into a sequence of nodes representing the correct letters making up the word, in the appropriate order. Likewise, for each paradigm, its time of acquisition by a map is the mean acquisition epoch of all forms belonging to the paradigm.

As a general trend, TSOMs acquire word forms by token frequency, with higher-frequency words being successfully recalled at earlier learning epochs. However, when it comes to the actual timing of paradigm acquisition, things get considerably more complex, with the notion of morphological regularity interacting non-trivially with token frequency distributions. In fact, in both
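The acquisition measures used in the simulations can be illustrated with a short sketch. It assumes that, for every training epoch, we already know whether the map recalled a word correctly. The reading of "time of acquisition" as the first epoch from which recall stays correct is one possible interpretation adopted here for illustration, and the Italian forms and epoch values are invented toy data, not results from the experiments.

```python
from statistics import mean


def word_acquisition_epoch(recalled):
    """Return the first epoch (1-based) from which recall stays correct
    until the end of training, or None if the word is never acquired.
    `recalled` holds one boolean per epoch (assumed input format)."""
    epoch = None
    for i, ok in enumerate(recalled, start=1):
        if ok and epoch is None:
            epoch = i
        elif not ok:
            epoch = None
    return epoch


def paradigm_acquisition(word_epochs):
    """Summarise a paradigm from the acquisition epochs of its forms."""
    epochs = list(word_epochs.values())
    return {
        # mean acquisition epoch of all forms belonging to the paradigm
        "time_of_acquisition": mean(epochs),
        # epochs between the first and the last acquired member
        "learning_span": max(epochs) - min(epochs),
    }


# Invented epochs, for illustration only.
toy_paradigm = {"amo": 12, "ami": 15, "ama": 12, "amiamo": 21}
summary = paradigm_acquisition(toy_paradigm)
print(summary)
```

A shorter learning span means that the members of a paradigm are acquired closer together in training time.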

German and Italian, the vast majority of paradigms are acquired earlier (p<.005) in a UD regime than in an SD regime (Fig. 2).

predictable stem allomorphy due to a limited number of alternants, show a correlation between stem cumulative frequency and acquisition time (r=-.24, p<.00001).

Conversely, in Italian, where verb conjugation exhibits more extensive and less predictable patterns of allomorphy than in German (Pirrelli, 2000), acquisition of irregular paradigms does not appear to benefit from stem cumulative token frequencies (r=.01, p>.5). This suggests that extensive allomorphy in a paradigm tends to minimise the influence of cumulative frequency on its acquisition, and isolated forms

morphological variability of the constructions’ components; iii) the variability with respect to determiners; iv) the variability with respect to adjectival and adverbial modifications; v) the variability in the linear order.

4. variational profiles are then used to measure the lexical, morphological and syntactic degrees of freedom of Cxns, providing a multidimensional quantitative characterization of their level of fixedness.

3.2 Entropy-based Cxn fixedness modeling

In what follows, we devise a way to encode the

LEXICAL VARIABILITY. The entropy of the lexical instantiation of the slot positions of a Frame is calculated by assuming that the states x of the random variable X are all the possible fillers that can instantiate a given slot in Cxn (e.g. in subj#gettare#obj:luce#su X, X can be filled by vicenda ‘matter’, mistero ‘mystery’, etc.).

MORPHOLOGICAL VARIABILITY. It is calculated as the entropy of the morphological features manifested by the fillers of a Cxn (e.g., gettare#ombra-fs ‘cast shadow-singular’; gettare#ombra-fp ‘cast shadow-plural’).

ARTICLES VARIABILITY.
Figure 2: Time course of regular (left) and irregular (right) paradigms ranked by increasing learning epoch under SD (grey circles) and UD (white circles) regimes for both Italian (top) and German (bottom). Values are averaged across 5 map instances for each type.

4 Frequency by regularity interaction

Our simulations show that, in both languages, word forms in regular paradigms tend to be acquired earlier (significantly earlier learning epochs, p<.001), and regular paradigms are acquired more quickly (significantly shorter learning spans, i.e. lower number of epochs between the acquisition time of the first and the last member of a paradigm, p<.005) than irregular paradigms are. In German data, regular paradigms are less sensitive to token frequency effects than irregular paradigms are, as witnessed by the strong correlation (r=.95, p<.00001) between the time course of acquisition of regular paradigms in SD and UD regimes (Fig. 2, bottom left panel). Token frequency affects the acquisition of regular paradigms to a lesser extent than the acquisition of irregular ones, because regular stems can take advantage of their cumulative frequency across the whole paradigm.

can only take advantage of their own token frequency, while taking no advantage of the frequency boost provided by other cells of the same paradigm. As a result, Italian irregular paradigms are acquired significantly (p<.005) later than their German homologues.

Our data cannot be explained away as a simple by-product of word-frequency effects. Experiments provide, in fact, evidence of interactive processing effects in word acquisition, whereby morphological regularity modulates frequency. Data analysis shows that recurrent patterns appear to determine global co-organisation of stored word forms and distributed, overlapping memory traces, which ultimately favour generalisation in lexical acquisition. Forms containing recurrent patterns can take advantage of the memory traces shared with other related forms, namely forms sharing the same stem, and connections between the nodes making up their memory traces are strengthened since patterns are shown more often in training, similarly to high-frequency isolated words.

This is particularly true for regular, highly entropic paradigms, i.e. those regular paradigms whose members exhibit uniform frequency distributions, and for irregular highly systematic paradigms. Conversely, where memory traces overlap less systematically, this effect is considerably reduced, as witnessed by the difference in time of acquisition between regular and irregular paradigms, particularly in Italian conjugation.

In TSOMs, the effects are the dynamic result of two interacting dimensions of memory self-organisation: (i) the syntagmatic or linear dimension, which controls the level of

This index encodes how variable is the presence or absence of articles determining the available slots in a Cxn, and, if appropriate, their type (DEFinite vs. INDefinite): for instance, gettare#∅+acqua#su DEF+fuoco.

PRESENCE OF MODIFIERS. This index encodes how variable is the presence or absence of adjectives, adverbs or prepositional phrases modifying the available slots. In this way, it is possible to account for patterns like: gettare#molta+acqua#su ∅+fuoco.

DISTANCE VARIABILITY. This index exploits information on linear order available in SYMPAThy to estimate how variable is the distance in tokens between a TL and the other constituents of a given lexically specified Cxn.

In the experiment reported in the next section, we have combined the single variability measures Hrel(X) into an overall flexibility index F(X) corresponding to four possible combinations:

• SUM: F(X) is obtained by summing over all the single Hrel(X) values;
• AVERAGE: F(X) is the mean of the single Hrel(X) values;
• AVERAGEPOS: F(X) is the mean of the positive Hrel(X) values;
• MAX: F(X) is the highest Hrel(X) value.

We leave to future research the investigation of further ways to combine the variability indexes.

4 Evaluation

In order to evaluate our approach, we set out to test

variation possibilities shown by Cxns, as well as a meaningful way to combine them. Specifically, we distinguish a series of dimensions of variation and propose to exploit Entropy (Shannon, 1948) to measure how fixed is the behavior of a Cxn in a given dimension.

Entropy is a measure of randomness, calculated as the average uncertainty of a single variable:

H(X) = -\sum_{x \in X} p(x) \log_2 p(x)    (1)

This measure of randomness can be adapted to our needs by taking the variable X as being a Cxn of interest, and the states of the system x as its values on one dimension of variation. Lower entropy values are to be understood as evidence of fixedness, while higher values suggest a more variable distribution of the states of a given variable, i.e. the target construction tends to be freer.

Observed entropy values, however, can span from 0 to the logarithm of the number of values that X can assume. As a consequence, entropy values related to different dimensions of variation are not comparable, and cannot be combined into a single fixedness index. We overcome this limitation by following Wulff (2008) and describing the randomness of each variability dimension in terms of relative entropy, computed as the ratio between the observed entropy from eq. 1 and the maximum entropy Hmax for the variable X:

H_{rel}(X) = \frac{H(X)}{H_{max}(X)} = \frac{H(X)}{\log_2 |X|}    (2)
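A minimal sketch of eq. 1, eq. 2 and the four combination methods (SUM, AVERAGE, AVERAGEPOS, MAX) may help make the computation concrete. The function names and the toy observations are hypothetical. The sketch also makes two simplifying assumptions that the text does not state: |X| is taken to be the number of distinct states actually observed, and a dimension with a single observed state is assigned a relative entropy of 0.

```python
import math
from collections import Counter


def entropy(observations):
    """Shannon entropy (eq. 1) of the observed states, in bits."""
    counts = Counter(observations)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def relative_entropy(observations):
    """Observed entropy divided by the maximum entropy log2(|X|) (eq. 2).
    Assumption: |X| is the number of distinct observed states."""
    n_states = len(set(observations))
    if n_states < 2:
        return 0.0  # assumption: a single observed state is fully fixed
    return entropy(observations) / math.log2(n_states)


def flexibility(h_rel_values, method):
    """Combine per-dimension Hrel values into an overall index F(X)."""
    if method == "SUM":
        return sum(h_rel_values)
    if method == "AVERAGE":
        return sum(h_rel_values) / len(h_rel_values)
    if method == "AVERAGEPOS":
        positive = [v for v in h_rel_values if v > 0]
        return sum(positive) / len(positive) if positive else 0.0
    if method == "MAX":
        return max(h_rel_values)
    raise ValueError(f"unknown combination method: {method}")


# Invented observations for three variability dimensions of one Cxn.
articles = ["DEF", "DEF", "IND", "DEF"]
modifiers = ["none", "none", "none", "none"]
distance = [1, 1, 2, 2]
profile = [relative_entropy(d) for d in (articles, modifiers, distance)]
for method in ("SUM", "AVERAGE", "AVERAGEPOS", "MAX"):
    print(method, round(flexibility(profile, method), 3))
```

Because each Hrel value lies between 0 and 1, dimensions with different numbers of possible states become comparable before they are combined.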
This measure, that ranges from 0 to 1, has been employed as a flexibility measure to describe the flexibility of a given set of target Cxns along the following dimensions of variation:

if our indexes can mimic the intuitive judgments of native speakers about the fixedness of fully lexically specified constructions. To do so, we selected a subset of the idioms in the norms collected

predictability and entrenchment of memory traces in the lexicon through the probabilistic distribution of weights over inter-node Hebbian connections; and (ii) the paradigmatic or vertical dimension, which controls for the number of

In fact, forms in regular paradigms exhibit a significant correlation between stem cumulative frequency and time of acquisition (r=-.40, p<.00001). Similarly, also German irregular paradigms, which exhibit a
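Correlation coefficients of the kind reported here can be computed with a few lines of code. The sketch below assumes that r is Pearson's product-moment coefficient. The evaluation of the fixedness indexes names this measure explicitly, while for the acquisition data it is only an assumption. The input values are invented and serve only to show the direction of the effect: a negative r means that higher cumulative frequency goes with earlier acquisition.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Invented values: stem cumulative frequency and acquisition epoch.
stem_frequency = [900, 450, 300, 120, 60]
acquisition_epoch = [8, 11, 15, 14, 22]
r = pearson_r(stem_frequency, acquisition_epoch)
print(round(r, 2))
```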

• subj#obj#comp-su
  – OBJ Filler: {acqua, ombra, benzina, ...}; {Substance, Natural Phenomenon, ...}
  – COMP-su Filler: {fuoco, tavolo, bilancia, lastrico, istituzione, ...}; {Artifact, Substance, ...}
• subj#obj#comp-in
  – OBJ Filler: {scompiglio, sasso, corpo, fumo, cadavere, ...}; {Natural Object, Substance, ...}
  – COMP-in Filler: {panico, caos, sconforto, mare, stagno, cestino, ...}; {Feeling, State, ...}
• subj#obj
  – OBJ Filler: {spugna, base, ombra, acqua, luce, ponte, ...}; {Substance, Artifact, ...}

At this point, we observe that all these words are typically associated with our TL, but we don’t know in which way they are all linked to one another. For instance, we have no elements for thinking that subj#gettare#acqua#su fuoco is any different from subj#gettare#acqua#su tavolo or subj#gettare#ombra#su istituzione. However, while gettare acqua sul fuoco ‘defuse’ is an idiom in Italian, gettare acqua sul tavolo only has a literal meaning (‘throw water on the table’);

• morphosyntactic features: gender, number, finiteness, tense, etc.

3 WoC fixedness with SYMPAThy

Since constructions span along a continuum between fixedness and productivity, there have been various attempts at measuring how fixed a given WoC is, mostly based on surface features. Nissim and Zaninello (2011) assess the fixedness of a subset of complex nominals by comparing inflected and lemmatized forms, and taking into account the proportion of elements that undergo variation in a given MWE. Inflection is also used by Squillante (2014) on noun-adjective expressions, and is combined with two other measures, interruptibility and substitutability. Zeldes (2013) extends Baayen’s morphological productivity approach to argument structure and estimates the productivity of a syntactic slot from the number of its hapax noun fillers. Wulff (2009) uses a set of morphosyntactic indexes of variations and a collocation-based index of compositionality as variables in a regression study to determine fixedness.

similar, paradigmatically-related word forms that get co-activated when one member of a paradigm is input to the map (Pirrelli et al., 2014).

High-frequency words develop quick entrenchment of Hebbian connections, which eventually cause high levels of node activation in their memory traces and sparser co-activation of memory traces of other words. Strong connections and high activation levels mean high expectations for frequently activated memory traces, which are thus recalled more easily and are less confusable with other neighbouring words. Likewise, in regular and sub-regular paradigms, sharing memory traces can strengthen connections and raise node activation levels, since all related forms can take advantage of the memory traces shared with other members of the same paradigm.

(Fig. 3, bottom). We observe, in fact, a highly significant correlation (r=.49, p<.00001 for both datasets) between levels of filtering and words’ learning epochs.

High-frequency words predictably show higher activation levels than low-frequency words, with an interesting difference of the interaction of frequency and activation levels of regulars and irregulars. High-frequency, highly irregular words (e.g. German ist or Italian è) are stored in isolation, with highly-activated memory nodes and no co-activation with other words. As a result, they require little filtering to be recalled and are acquired very quickly. High-frequency regular paradigms, although in both Italian and German training sets their average frequency is nearly half the average frequency of high-frequency irregulars, show comparable levels of activation with high-frequency irregulars, due to the facilitatory effect of having more words that consistently activate the same pattern of nodes.

This evidence shows that regularity indeed modulates the interaction between frequency and
subj#gettare#fango#su istituzione is yet different, since gettare fango su ‘defame’ is a fixed expression, but the Filler istituzione ‘institution’ is just one of many possibilities, so the expression is partially fixed, resulting in something like [gettare fango su PERSON/INSTITUTION]. The significance of gettare acqua sul fuoco with respect to gettare acqua sul tavolo emerges much more clearly if we use a P-based method. Extracting surface material, the former expression will be

We extend the state of the art of the quantitative approach to construction fixedness by exploiting the potentialities of SYMPAThy to develop a series of corpus-based indexes able to describe the fixedness of some idiomatic expressions. Our approach is then evaluated by comparing, for a sample list of expressions, a composition of our indexes against the behavioral judgments of syntactic flexibility collected by Tabossi et al. (2011).

Figure 3: Levels of activation strength (top) and filtering (bottom) for Italian (left) and German (right), for four regularity-by-frequency classes. Low-frequency is set below the first quartile of frequency distributions in the two training sets, while high-frequency is set above the third quartile.

activation strength, and it gives a strong indication that acquisition of regulars is typically paradigm-based, whereas acquisition of irregulars is mostly item-based.

Surely, as the notion of paradigm regularity is inherently graded, some verb systems show higher sensitivity to these effects than others. This is illustrated by German sub-regular paradigms, which present fewer and more predictable stem alternants than Italian sub-paradigms, and thus larger stem-sharing word families. Accordingly,
ranked higher than the latter (given the pattern “V N PREPART N”) as the association between all words is stronger.

So, fine-grained differences do not emerge with the S-method, while the P-based method fails to capture the higher-level generalizations we get with the S-method. In order to get the best of both worlds, we extracted corpus data into SYMPAThy (SYntactically Marked PATterns), a database where information on both levels is stored and accessible jointly:

• syntactic frames with argument slots and fillers;
• linear order of all elements for each TL;
• POS tag for each element (simple preposition vs. preposition with article, definite vs. indefinite article, modal vs. full verb, etc.);

flexive form gettarsi ’throw oneself’ and objectless forms are excluded.

3.1 The combinatory behaviour of a TL

In the SYMPAThy model, the combinatory space of a Target Lexeme is assumed to be formed by a network of Cxns, varying for their degree of fixedness/productivity. For any given TL such a representation is built by means of the following four-step procedure:

1. its SYMPAThy patterns are extracted from a reference corpus;
2. the set of single and multiple slot Cxns that TL combines with are semi-automatically identified. An example for the verb gettare is reported and explained in Appendix 1;
3. each construction is associated with a variational profile formed by a number of statistics extracted from the SYMPAThy pattern to estimate: i) the variability of the fillers that instantiate the syntactic slots of constructions; ii) the

TSOMs allocate comparatively higher levels of activation to low-frequency German sub-regulars and acquire them earlier than their Italian homologues.

The evidence reported here establishes, in our view, an important connection between aspects of morphological structure, frequency distributions of words in paradigms, and lexical acquisition in concurrent, competitive storage. Acquisition of redundant morphological patterns plays an increasingly important role in an emergent lexicon, shifting acquisitional strategies from rote memorisation (typical of irregular low-entropy paradigms) to dynamic memory-based generalisation.

This dynamic provides an algorithmic account of the observation that regularity favours acquisition of both high- and low-frequency words, as shown in Fig. 3, where we compare average levels of activation for four classes of training word forms: low-frequency regulars, low-frequency irregulars, high-frequency regulars and high-frequency irregulars.¹

Activation levels of low-frequency words appear to be significantly stronger within regular paradigms than within irregular paradigms (Fig. 3, top). Stronger activation levels make patterns less confusable and easier to be accessed, as witnessed by the lower level of filtering² required for activation patterns to be recalled accurately

1 Frequency thresholds are set below the first quartile (low frequency) and above the third quartile (high frequency) in the frequency distribution of training word forms.

2 Filtering an integrated activation pattern refers to the process of bringing down to zero the levels of activation of nodes that do not reach a set threshold.
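The two operations described in the footnotes, quartile-based frequency classes and threshold filtering of an activation pattern, can be sketched as follows. The function names, the "mid" label for the remaining forms, and the use of the default method of Python's `statistics.quantiles` are assumptions made for illustration; the text does not specify how the quartiles were estimated.

```python
from statistics import quantiles


def frequency_class(freq, training_freqs):
    """Label a word form as low-, mid- or high-frequency using the first
    and third quartiles of the training frequency distribution."""
    q1, _, q3 = quantiles(training_freqs, n=4)
    if freq < q1:
        return "low"
    if freq > q3:
        return "high"
    return "mid"


def filter_pattern(activations, threshold):
    """Bring down to zero the activation of every node that does not
    reach the threshold, leaving the other activations unchanged."""
    return [a if a >= threshold else 0.0 for a in activations]


# Invented token frequencies and activation levels.
training_freqs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(frequency_class(1, training_freqs), frequency_class(11, training_freqs))
print(filter_pattern([0.9, 0.2, 0.5], threshold=0.5))
```

On this reading, a pattern that can be recalled accurately after only mild filtering is one whose relevant nodes already stand out from the rest of the map.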

References

Maria Alegre and Peter Gordon. 1999. Frequency effects and the representational status of regular inflections. Journal of Memory and Language, 40: 41-61.

Harald R. Baayen, Richard Piepenbrock and Leon Gulikers. 1995. The CELEX Lexical Database (CD-ROM). Philadelphia: Linguistic Data Consortium.

Harald R. Baayen. 2007. Storage and computation in the mental lexicon. In G. Jarema and G. Libben (eds.), The Mental Lexicon: Core Perspectives, 81-104. Amsterdam: Elsevier.

Lucia Colombo, Alessandro Laudanna, Maria De Martino and Cristina Brivio. 2004. Regularity and/or consistency in the production of the past participle? Brain and Language, 90: 128-142.

Ewa Dabrowska. 2004. Rules or schemata? Evidence from Polish. Language and cognitive processes, 19 (2): 225–271.

Ewa Dabrowska. 2005. Productivity and beyond: mastering the Polish genitive inflection. Journal of child language, 32: 191-205.

Nick C. Ellis and Richard Schmidt. 1998. Rules or Associations in the Acquisition of Morphology? The Frequency by Regularity Interaction in Human and PDP Learning of Morphosyntax. Language and Cognitive Processes, 13: 307-336.

Marcello Ferro, Giovanni Pezzulo and Vito Pirrelli. 2010. Morphology, Memory and the Mental Lexicon. In Pirrelli, V. (ed.), Lingue e Linguaggio, IX(2): 199-238.

Jan Koutnik. 2007. Inductive Modelling of Temporal Sequences by Means of Self-organization. In Proceeding of International Workshop on Inductive Modelling. Prague: 269-277.

Marie Labelle and Lori Morris. 2011. The acquisition of a verbal paradigm: Verb Morphology in French L1 children. Prépublication. (Montréal, Québec, Canada, UQAM, département de linguistique). http://www.archipel.uqam.ca/3992/1/Labelle-Morris_AcquisitionVerbalParadigm.pdf

Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, Henrik Dittmann, Alessandro Lenci and Vito Pirrelli. 2014. The PAISÀ Corpus of Italian Web Texts. In F. Bildhauer and R. Schäfer (eds.) Proceedings of the 9th Web as Corpus Workshop (WaC-9): 36-43. Gothenburg.

Claudia Marzi, Marcello Ferro and Vito Pirrelli. 2012. Word alignment and paradigm induction. Lingue e Linguaggio, XI (2): 251-274.

Claudia Marzi, Marcello Ferro and Vito Pirrelli. 2014. Morphological structure through lexical parsability. Lingue e Linguaggio, XIII (2): 263-290.

Claudia Marzi. 2014. Models and dynamics of the morphological lexicon in mono- and bilingual acquisition. Unpublished PhD Dissertation. University of Pavia. www.comphyslab.it/redirect/?id=claudia.marzi.en_phd

Margherita Orsolini, Rachele Fanari and Hugo Bowles. 1998. Acquiring regular and irregular inflections in a language with verb classes. Language and cognitive processes, 13(4): 425-464.

Steven Pinker and Michael Ullman. 2002. The past and future of the past tense. Trends in Cognitive Science, 6: 456-463.

Vito Pirrelli, Claudia Marzi and Marcello Ferro. 2014. Two-dimensional Wordlikeness Effects in Lexical Organisation. In: Basili R., Lenci A., Magnini B. (eds.) Proceedings of the First Italian Conference on Computational Linguistic, December 9-11, 2014. 301-305, Pisa: Pisa University Press.

Vito Pirrelli, Marcello Ferro and Basilio Calderone. 2011. Learning paradigms in time and space. Computational evidence from Romance languages. In M. Maiden, J. C. Smith, M. Goldbach and M. O. Hinzelin (eds.), Morphological Autonomy: Perspectives from Romance Inflectional Morphology, 135-157. Oxford: Oxford University Press.

Vito Pirrelli. 2000. Paradigmi in morfologia. Un approccio interdisciplinare alla flessione verbale dell'italiano. Pisa-Roma: Istituti editoriali e poligrafici internazionali.

Kim Plunkett and Virginia Marchman. 1993. From rote learning to system building – acquiring verb morphology in children and connectionist nets. Cognition, 48: 21-69.

David E. Rumelhart and James L. McClelland. 1986. On learning the past tense of English verbs. In McClelland, J.L. and Rumelhart, D.E. (eds.) Parallel distributed processing, 217-270. Cambridge: MIT Press.

Mapping the Constructicon with SYMPAThy.
Italian Word Combinations between fixedness and productivity

Alessandro Lenci, Gianluca E. Lebani, Marco S. G. Senaldi (University of Pisa)
Sara Castagnoli, Francesca Masini (University of Bologna)
Malvina Nissim (University of Groningen)

Abstract

This work introduces SYMPAThy, a data representation model in which the combinatorial properties of a lexical item are described by merging surface and deeper linguistic information. The proposed approach is then evaluated by comparing, for a sample list of verbal idioms, a set of SYMPAThy-based fixedness indexes against the relevant speaker-elicited indexes available in the descriptive norms collected by Tabossi et al. (2011).

1 Word combinatorics and constructions

By “Word Combinations” (WoCs) we broadly refer to the range of constructions typically associated with a lexical item. In Construction Grammar, constructions (Cxn) are conventionalized form-meaning pairings that can vary in both complexity and schematicity (Fillmore et al., 1988; Goldberg, 2006; Hoffmann and Trousdale, 2013). The Constructicon spans from fully specified structures (kick the bucket) to complex, productive abstract structures such as argument patterns (e.g., the Ditransitive Cxn “Subj V Obj1 Obj2”, she baked him a cake), passing through “intermediate” Cxns with different degrees of schematicity, complexity and productivity (e.g., take Obj for granted), in what is known as the lexicon-syntax continuum. WoCs thus comprise so-called Multiword Expressions (MWEs), i.e. a variety of recurrent expressions acting as a single unit at some level of linguistic analysis, like phrasal lexemes, idioms, collocations (Calzolari et al., 2002; Sag et al., 2002; Gries, 2008), as well as the preferred distributional properties of a word at a more abstract level, i.e. argument structures and selectional preferences (Goldberg, 1995).

Each lexeme can thus be described as having a combinatory potential to be defined and observed at a more constrained, surface POS-pattern level (P-level) and at the more abstract level of syntactic structure (S-level). These two levels are often kept separate, not only theoretically, but also computationally, as their performance varies according to the different types of combinations that we want to track (Sag et al., 2002; Evert and Krenn, 2005).

We advocate a unified and integrated view of a lexeme’s combinatory potential, in order to capture both fixed combinations (MWEs of various types) and more productive aspects of the lexeme’s distributional behaviour. The theoretical premises lie in the constructionist view of the mental lexicon outlined above, whereas a proposal for a computational implementation is illustrated here. Specifically, we i) present SYMPAThy, a model of data representation that takes into account both surface and deeper linguistic information; ii) develop and test an index of productivity for Italian WoCs based on SYMPAThy.

2 SYMPAThy: a joint approach to WoCs

We argue that to obtain a comprehensive picture of the combinatory potential of a word and enhance extracting efficacy for WoCs, the P-based approach (which exploits sequences of POS-patterns and association measures) and the S-based approach (which exploits syntactic dependencies and association measures) should be combined. We illustrate this point with an example based on the Target Lexeme (TL) gettare ‘throw’ (V).¹

We want to use S-based methods to capture the fact that V occurs typically within some syntactic Frames and not others, that for each Frame we have typical Fillers (lexical items) instantiating Frame slots, and that each slot is associated with certain semantic (ontological) classes:²

1 All data is from a version of the “la Repubblica” corpus (Baroni et al., 2004) POS tagged with the Part-Of-Speech tagger described in Dell’Orletta (2009) and dependency parsed with DeSR (Attardi and Dell’Orletta, 2009).

2 Data extracted by LexIt (Lenci, 2014). The list is partial: only the first three Frames are included; Frames with the re-

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

References

[Левенштейн1965] Владимир Иосифович Левенштейн. 1965. Двоичные коды с исправлением выпадений, вставок и замещений символов. Докл. АН СССР, 4(163):845–848.

[Dvonč et al.1966] Ladislav Dvonč, Gejza Horák, František Miko, Jozef Mistrík, Ján Oravec, Jozef Ružička, and Milan Urbančok. 1966. Morfológia slovenského jazyka. Vydavateľstvo SAV, Bratislava, Slovakia, 1st edition. 895 p.

[Garabík and Šimková2012] Radovan Garabík and Mária Šimková. 2012. Slovak Morphosyntactic Tagset. Journal of Language Modelling, 0(1):41–63.

[Garabík2007] Radovan Garabík. 2007. Slovak morphology analyzer based on Levenshtein edit operations. In M. Laclavík, I. Budinská, and L. Hluchý, editors, Proceedings of the WIKT’06 conference, pages 2–5, Bratislava. Institute of Informatics SAS.

[Hajič2004] Jan Hajič. 2004. Disambiguation of Rich Inflection (Computational Morphology of Czech). Karolinum, Charles University Press, Prague, Czech Republic.

[Karčová2008] Agáta Karčová. 2008. Príprava a uskutočňovanie projektu morfologického analyzátora. In Anna Gálisová and Alexandra Chomová, editors, Varia. 15. Zborník materiálov z XV. kolokvia mladých jazykovedcov, pages 286–292, Bratislava. Slovenská jazykovedná spoločnosť pri SAV – Katedra slovenského jazyka a literatúry FHV UMB v Banskej Bystrici.

Morphological Priming in German: The Word is Not Enough (Or Is It?)

Sebastian Padó∗ Britta D. Zeller∗ Jan Šnajder†
∗Stuttgart University, Institut für maschinelle Sprachverarbeitung
Pfaffenwaldring 5b, 70569 Stuttgart, Germany
†University of Zagreb, Faculty of Electrical Engineering and Computing
Unska 3, 10000 Zagreb, Croatia
{pado,zeller}@ims.uni-stuttgart.de [email protected]

Abstract

Studies across multiple languages show that overt morphological priming leads to a speed-up only for transparent derivations but not for opaque derivations. However, in a recent experiment for German, Smolka et al. (2014) show comparable speed-ups for transparent and opaque derivations, and conclude that German behaves unlike other Indo-European languages and organizes its mental lexicon by morphemes rather than lemmas. In this paper we present a computational analysis of the German results. A distributional similarity model, extended with knowledge about morphological families and without any notion of morphemes, is able to account for all main findings of Smolka et al. We believe that this puts into question the call for German-specific mechanisms. Instead, our model suggests that cross-lingual differences between morphological systems underlie the experimentally observed differences.

1 Semantic and Morphological Priming

Priming is a general property of human language processing: it refers to the speed-up effect that a stimulus can have on subsequent processing (Meyer and Schvaneveldt, 1971). This effect is assumed to result from an activation (in a broad sense) of mental representations, and priming is a popular method to investigate properties of the mental lexicon. The original study by Meyer and Schvaneveldt established lexical priming (nurse → doctor), but priming effects have also been identified on other linguistic levels, such as syntactic priming (Bock, 1986) and morphological priming (Kempley and Morton, 1982).

A recent study by Smolka et al. (2014) investigated overt morphological priming on prefix verbs in German, where the base verb and derived verb can be semantically related (transparent derivation: schließen – abschließen (close – lock)) or not (opaque derivation: führen – verführen (lead – seduce)). Experiment 1, an overt visual priming experiment (300 ms SOA), involved 40 six-tuples that paired up a base verb with five prefix verbs of five prime types (see Figure 1). The verbs were normed carefully, e.g., for association, to exclude confounding factors. The authors reported three main findings: (a), no priming for Form and Unrelated; (b), no priming for Synonymy; (c), significant priming of the same strength for both Transparent and Opaque Derivation.

These findings suggest that morphological priming on German prefix verbs uses a mechanism that is different from lexical priming, which assumes that the strength of the semantic relatedness is the main determinant of priming – i.e., lexical priming would predict finding (a), but neither (b) nor (c). The findings by Smolka et al. are also at odds with overt priming patterns found in similar experimental setups for other languages such as French (Meunier and Longtin, 2007) and Dutch (Schriefers et al., 1991), where patterns were found to be indeed consistent with lexical priming. Smolka et al. (2014) interpret this divergence as evidence for a German Sonderweg: the typological properties of German (separable prefixes, morphological richness, many opaque derivations) are taken to suggest a morpheme-based organization of the mental lexicon more similar to Semitic languages like Hebrew or Arabic than to other Indo-European languages.

Our paper investigates this claim on the computational level. We present a simple model of corpus-based word similarity, extended with a database of morphological families, that is able to predict the three main findings by Smolka et al. outlined above. The ability of the model to do so, even though it operates completely at the word level without any notion of morphemes, may put into question Smolka


plete. We assume that at least the most common inflectional paradigms (used for proper nouns) are present in the training set.

                          number of lemmata assigned
                          per word form
word forms    [%]         correct    all
100           18.9        0          1
4             0.8         0          2
418           79.2        1          1
6             1.1         1          2
Σ 528         100.0

Table 2: Number of automatically assigned lemmata per word form.

4 Evaluation

We used the algorithm to extract proper nouns from the Slovak National Corpus, version prim-6.1-public-all³, of the size 829 million tokens, and evaluated the results on the proper nouns from the evaluation set. The percentage of correctly automatically assigned lemmata is shown in Table 2 – we see that 79.2% word forms had been assigned a unique lemma, which was also the correct one, while 18.9% had been assigned a unique, but incorrect lemma⁴.

Figure 2: Histogram of precision and recall on automatically assigned word forms of the lexeme(s) for the evaluation data. [x-axis: precision, recall; y-axis: frequency]

Figure 2 displays the precision and recall on word forms for proper nouns (i.e. how much of the lexeme has been extracted; the numbers are not weighted by the frequency of word forms in the corpus) from the evaluation set; we note that about 70 lexemes⁵ have precision ≈ 1; about 40 lexemes have recall ≈ 1, and about 50 lexemes have 0.9 ≳ recall ≳ 0.6, while only a small number of lexemes have lower precision. The lower recall is caused by insufficient data coverage – not all the word forms were present in the analysed corpus. The precision we obtained is excellent and the accuracy of automatic lemma assignment is good.

5 Augmenting Morphological Database

The abovementioned process was used to increase the number of proper nouns in Slovak morphological database. We used the extracted candidates from the prim-6.1-public-all corpus with a number of occurrences at least 100 (count of all possible word forms derived from a given lemma). We calculate the coverage of word forms for one lemma as r = C(w, t)/C(g), where C(w, t) is the number of generated tuples of word forms and their corresponding morphological tags, and C(g) the number of grammar categories (usually 7 or 14; 7 cases including the vocative and one or two grammatical numbers, with many proper nouns present only in singular).

After removing generated word forms with no corpus evidence, the average coverage of word forms per lemma is r = 0.84 ± 0.23, i.e. 84% of word forms is present in the corpus, 0.23 is the standard deviation of the coverage. Generated word forms still contain a lot of noise, therefore we also removed those word forms whose contribution to the number of occurrences of given lemma was less than 1% (it is rare for a grammatical case to have such a low percentage compared to other cases). After this, the coverage changed to r = 0.75 ± 0.24, where again 0.24 is the standard deviation of the coverage. Then we manually proofread, corrected and filled in the word forms for the several hundred most frequent lemmata. After adding these words to the morphological database, we iterated the process, re-training the algorithm and generating another list of less frequent proper nouns.

6 Conclusion

The method has been used to improve the coverage of proper nouns in the Slovak morphological database and is used as a part of morphological guesser, providing candidate lemmata and morphological tags for unknown proper nouns, as part of the morphosyntactic analysis and part of speech tagging of the Slovak National Corpus.⁶

³ http://korpus.juls.savba.sk/res.html
⁴ For 13 word forms (2.5%) the correct lemma was not present in the interval of 2000 words.
⁵ Since the number of proper nouns in our evaluation set was 101, these numbers are fortuitously almost identical to percentage.
⁶ http://korpus.juls.savba.sk

et al.’s call for novel morpheme-level mechanisms for German.

Target                       binden (bind)
1 Transparent Derivation     zubinden (tie)
2 Opaque Derivation          entbinden (give birth)
3 Synonym                    zuschnüren (tie)
4 Form                       abbilden (depict)
5 Unrelated                  abholzen (log)

Figure 1: Smolka et al. (2014)’s five prime types

2 Modeling Priming

We model the priming effects shown in Smolka et al. by combining two computational information sources: A distributional semantic model, and a derivational lexicon.

Distributional Semantics and Priming. Distributional semantics builds on the distributional hypothesis (Harris, 1968), according to which the similarity of lemmas correlates with the similarity of their linguistic contexts. The meaning of a lemma is typically represented as a vector of its contexts in large text collections (Turney and Pantel, 2010; Erk, 2012), and semantic similarity is operationalized by using a vector similarity measure such as cosine similarity. Traditional models construct vectors directly from context co-occurrences, while more recent models learn distributed representations with neural networks (Mikolov et al., 2013), which can be seen as advanced forms of dimensionality reduction.

A classical test case of distributional models is exactly lexical priming, which has been modeled successfully in a number of studies (McDonald and Lowe, 1998; Lowe and McDonald, 2000). The assumption of this model family, which we call DISTSIM, is that the cosine similarity between a prime vector $\vec{p}$ and a target vector $\vec{t}$ is a direct predictor of lexical priming:

$$\mathrm{priming}_{\mathrm{DISTSIM}}(p, t) \propto \cos(\vec{p}, \vec{t})$$

Regarding morphological priming, this model predicts the result patterns for French and Dutch but should not be able to explain the German results.

Derivational Morphology in a Distributional Model. In Padó et al. (2013), we proposed to extend distributional models with morphological knowledge in the form of derivational families $\mathcal{D}$, that is, sets of lemmas that are derivationally (either transparently or opaquely) related (Daille et al., 2002), such as:

knien_V (to kneel_V), beknien_V (to beg_V), Kniende_N (kneeling person_N), kniend_A (kneeling_A), Knie_N (knee_N)

While our motivation was primarily computational (we aimed at improving similarity estimates for infrequent words by taking advantage of the shared meaning within derivational families), these families can be reinterpreted in the current context as driving morphological generalization in priming. More specifically, consider the following model family, which we call MORGEN and which is an asymmetrical version of the “Average Similarity” model from Padó et al. (2013):

$$\mathrm{priming}_{\mathrm{MORGEN}}(p, t) \propto \frac{1}{N} \sum_{p' \in \mathcal{D}(p)} \cos(\vec{p}\,', \vec{t})$$

This model predicts priming as the average similarity between the target t and all lemmas p′ within the derivational family of the prime p. It operationalizes the intuition that the prime “activates” its complete derivational family, no matter if transparently or opaquely related. Each of the family members then contributes to the priming effect just like in standard lexical priming.

The MORGEN model should have a better chance of modeling Smolka et al.’s results than the DISTSIM model. Note, however, that it remains completely at the word level, with derivational families as its only source of morphological knowledge.

3 Experiment

Setup. We compute a DISTSIM model by running word2vec (Mikolov et al., 2013), a system to extract distributional vectors from text, with its default parameters, on the lemmatized 800M-token German web corpus SdeWaC (Faaß and Eckart, 2013). To build MORGEN, we use the derivational families from DERIVBASE v1.4, a semi-automatically induced large-coverage German lexicon of derivational families (Zeller et al., 2013).¹

¹ DERIVBASE defines derivational families through a set of about 270 surface form transformation rules. MORGEN does not use information about rules, only family membership. Nevertheless, it is a question for future research to assess the potential criticism that the rule-based induction method implicitly introduces morpheme-level information into the families.
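The two scoring functions defined in Section 2 can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the function names, the two-dimensional toy vectors, and the toy family are invented and are not taken from word2vec, SdeWaC, or DERIVBASE. The sketch follows the MORGEN formula literally, taking N to be the number of family members that have a vector; whether the target itself should be excluded from the family average is not stated in the text, so it is left in.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def priming_distsim(prime, target, vectors):
    """DISTSIM: priming is proportional to cos(prime, target)."""
    return cosine(vectors[prime], vectors[target])


def priming_morgen(prime, target, vectors, families):
    """MORGEN: average cosine between the target and every member of the
    prime's derivational family (assumption: N = members that have a vector)."""
    family = families.get(prime, {prime})
    members = sorted(m for m in family if m in vectors)
    if not members:
        return 0.0
    total = sum(cosine(vectors[m], vectors[target]) for m in members)
    return total / len(members)


# Hypothetical toy vectors, chosen only so that the arithmetic is easy to follow.
VECTORS = {
    "binden": (1.0, 0.0),
    "zubinden": (1.0, 1.0),
    "entbinden": (0.0, 1.0),
}
# Hypothetical family for the opaque prime "entbinden".
FAMILIES = {"entbinden": {"binden", "zubinden", "entbinden"}}

distsim_score = priming_distsim("entbinden", "binden", VECTORS)
morgen_score = priming_morgen("entbinden", "binden", VECTORS, FAMILIES)
print(distsim_score, morgen_score)
```

With these toy values the opaque pair entbinden – binden has a cosine of zero under DISTSIM, while the family average under MORGEN is positive, which is the qualitative behaviour the model is designed to produce.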

Prime Type                   Exp. 1 (RT in ms)    DISTSIM (cos sim)    MORGEN (cos sim)
                             (Smolka et al.)      (our model)          (our model)
1 Transparent Derivation     563∗∗                0.44∗∗∗              0.35∗∗∗
2 Opaque Derivation          566∗∗                0.28                 0.35∗∗∗
3 Synonym                    580                  0.41∗∗∗              0.30∗
4 Form                       600                  0.24                 0.26
5 Unrelated                  591                  0.25                 0.27

Table 1: Average Reaction Times and cosines, respectively. Significance results compared to level Unrelated. Correct contrasts shown in boldface. Legend: ∗: p<0.05; ∗∗: p<0.01; ∗∗∗: p<0.001

Following Smolka et al., we analyze the predictions with a series of one-way ANOVAs (factor Prime Type with reference level Unrelated). As appropriate for multiple comparisons, we adopt a more conservative significance level (p=0.01).

Results. Table 1 reports the experimental results and model predictions (average experimental reaction times, cosine model predictions, and significance of differences). Model contrasts that match experiment contrasts are marked in bold.

As expected, DISTSIM predicts the patterns of classical lexical priming: we observe significant priming effects for Transparent Derivation and Synonymy, and no priming for Opaque Derivation. This is contrary to Smolka et al.’s experimental results.

Our instance of the MORGEN model does a much better job: It predicts highly significant priming effects for both Transparent and Opaque derivations (p<0.001) while priming is not significant at p<0.01 for Synonyms (p=0.04). These predictions correspond very well to Smolka et al.’s findings (cf. Table 1). We tested for two additional contrasts analyzed by Smolka et al.: the difference in priming strength between Transparent and Opaque Derivation (not significant in either experiment or model) and the difference between Transparent Derivation and Synonym (highly significant in both experiment and model).

4 Discussion

In sum, we find a very good match between MORGEN and the experimental results, while the DISTSIM model cannot account for the experimental evidence. Recall that the main difference between the two models is that MORGEN includes all members of the prime’s derivational family into the prediction of the priming strength. This leads to the following changes compared to DISTSIM:

1. For Opaque Derivation, MORGEN typically predicts stronger priming than DISTSIM, since prime and target are typically members of the same derivational family (assuming that there are no coverage gaps in DERIVBASE), and the average similarity between the target and the words in the family is higher than the similarity to the prime itself. Taking Figure 1 as an example, the Opaque Derivation pair entbinden (give birth) – binden (bind) is relatively dissimilar, and the similarity increases when other pairs like binden (bind) – zubinden (tie) are taken into consideration.

2. For Synonymy, MORGEN typically predicts weaker priming than DISTSIM, since the average similarity between target and all members of the prime’s family tends to be lower than the similarity between target and original prime. Again considering Figure 1, the Synonym pair binden (bind) – zuschnüren (tie) is relatively similar, while including terms derivationally related to the prime zuschnüren (tie) like schnurlos (cordless) introduces low-similarity pairs like schnurlos (cordless) – binden (bind).

MORGEN is not the only model that takes a distributional stance towards morphological derivation. Marelli and Baroni (2014) propose a compositional model that computes separate distributional representations for the meanings of stems and affixes and is able to compute representations for novel, unseen derived terms. The morpheme-level approach of Marelli and Baroni’s model corresponds more directly to Smolka et al.’s claims and might also be able to account for the experimental patterns.

However, our considerably simpler model, which only has knowledge about derivational families, is also able to do so. This at the very least means that morpheme-level processing is not an

. . .  Toska†  Toskala  Toskalu  Toskánec  Toskánska‡  Toskánske  Toskánsko‡  Toskánskom‡  Toskánsku‡  Toske†  Tosky†  Toso  . . .
       10      33       28       20        221         11         110         20           304         15      26      16

Table 1: Alphabetic list of proper noun candidates, with number of occurrences in the corpus. Note the extracted lemmata/lexemes Toska† (La Tosca), Toskánsko‡ (Tuscany), as well as unrelated Toskala, Toso and related Toskánec (inhabitant of Tuscany) and Toskánske (Tuscan, adjective).

Fig. 1 displays the distribution of known common (top) and proper (bottom) nouns, summed and normalized through all the nouns in the training set. Vertical error bars display the standard deviation for the given distance of word form from the lemma. From the graphs, we derive several conclusions – proper nouns are “less inflected”, higher ratio of them is in the basic form (lemma), and the maximum distance is ρ = 7 for common nouns (nouns with greater distance are those with very irregular declension, e.g. človek → ľudia “human/humans”) and ρ = 5 for proper nouns. Distributions of common and proper nouns from the evaluation set match those from the training set, so there appears to be some difference between common and proper nouns globally. However, categorising single nouns using these differences between distributions is not reliable.

3 Extracting Candidates

Our algorithm extracts plausible candidates for proper nouns (those beginning with a capital letter but not at the beginning of a sentence, together with some additional filters) and for each candidate, it considers the set of words with ρ ≤ 5. This would require calculating the Levenshtein distance between all pairs of words in the set and the complexity would be O(n²), which is unacceptable for corpus sized inputs. Unfortunately, Levenshtein distance is a metric but cannot be used to make an ordered set out of a list of words (in particular, it cannot be used to define an ordering binary relation ≤).

However, a trick can be applied – in a lexicographically ordered list of words (see Table 1) we need to look only at some interval around the word; word forms from beyond the interval are very unlikely to belong to the same lexeme. The complexity will be O(Cn); where C is the (constant) size of the interval. This means that for some of the nouns not all word forms will be covered; especially for the shorter ones, where there is a higher probability that many unrelated words will be within the interval. Empirically we estimated the reasonable interval width to be 2000 words – increasing it above this number does not improve the accuracy anymore and the speed is acceptable. It should be noted that this interval is not a width of the context of the concordance – this is an interval in the lexicographically ordered set of proper noun candidates extracted from a given text, e.g. from a novel if we want to extract the whole inflectional paradigms of (new, unknown) proper nouns from the novel, or indeed from the whole corpus, if we aim to augment a morphological database.

We formally describe a Levenshtein edit operation $e = (o, i_s, i_d)$ – a triple of operation type o, position $i_s$ in the source string s and position $i_d$ in the destination string d, where operation type o is one of replace, insert or delete. For replace or insert, the replacement/new character is taken from the destination string d.

Sequence of edit operations $q = (e_1, e_2, e_3, \ldots)$, together with the destination string d, when applied to a string $s \in S$ defines a mapping $f_{q,d} : S_{q,d} \mapsto S$,² where $S_{q,d}$ and $S$ are sets of strings.

If we denote by t a morphological tag for a given word form w, then for a lexeme with a lemma l a tuple (w, t) unambiguously refers to one inflected word form and its grammatical categories. We can then construct a sequence of edit operations leading from l to w, denoted by q(l, t).

For each proper noun from the training set, we precompute the functions $f_{q(l,t),l}$ (this can be improved by dividing the nouns into categories based on their declension rules and using only one noun from each category), to get the sequence of operations leading from the lemma to the tuple (w, t) of the word form and morphological tag. Then, for each extracted word, we apply the functions $f_{q(l,t),l}$ to every word from the abovementioned interval and the word with greatest coverage (sum of the frequencies of generated word forms within the interval) is declared the lemma to the extracted word. Of course, this maximum can be attained by more than one word, especially if the lexeme is incom-

² It is not possible to define the function f for every source string, since some of the operations might not be applicable to the given strings.
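The interval trick from Section 3 (Extracting Candidates) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names and the `width` parameter are hypothetical, the candidate list is the one from Table 1, and Python's default code-point ordering stands in for a proper Slovak lexicographic collation (so the order differs slightly from the table).

```python
from bisect import bisect_left


def levenshtein(source, target):
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # replacement
        previous = current
    return previous[-1]


def neighbours_in_window(sorted_words, word, width, max_distance=5):
    """Return words inside an interval of about `width` entries around `word`
    in the sorted list whose distance to `word` is at most `max_distance`.
    Only the interval is scanned, so each query costs O(width), not O(n)."""
    centre = bisect_left(sorted_words, word)
    lo = max(0, centre - width // 2)
    hi = min(len(sorted_words), centre + width // 2 + 1)
    return [w for w in sorted_words[lo:hi]
            if w != word and levenshtein(word, w) <= max_distance]


# Candidates from Table 1, sorted by code point (an assumption; see lead-in).
CANDIDATES = sorted([
    "Toska", "Toskala", "Toskalu", "Toskánec", "Toskánska", "Toskánske",
    "Toskánsko", "Toskánskom", "Toskánsku", "Toske", "Tosky", "Toso",
])

wide = neighbours_in_window(CANDIDATES, "Toska", width=2000)
narrow = neighbours_in_window(CANDIDATES, "Toske", width=2)
print(wide)
print(narrow)
```

The paper reports a width of 2000 as a reasonable empirical choice; the narrow call above uses a width of 2 only to make the effect of the interval visible on a twelve-word list.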

44 141 indispensable property of any model that explains Lexicon, pages 7–8, Niagara on the Lake, Canada. Extraction and Analysis of Proper Nouns in Slovak Texts Smolka et al.’s experimental results and that the Abstract. evidence for a special organization of the German Scott McDonald and Will Lowe. 1998. Modelling Radovan Garabík Radoslav Brída mental lexicon, in contrast to other languages, must functional priming and the associative boost. In Ľ. Štúr Institute of Linguistics Ľ. Štúr Institute of Linguistics be examined more carefully. Proceedings of the 20th Annual Conference of the Slovak Academy of Sciences Slovak Academy of Sciences In fact, our model provides a possible alterna- Cognitive Science Society, pages 675–680, Madison, WI. Bratislava, Slovakia Bratislava, Slovakia tive source of explanations for the cross-lingual [email protected] [email protected] differences: Since the MORGEN predictions are Fanny Meunier and Catherine-Marie Longtin. 2007. directly influenced by the size and members of Morphological decomposition and semantic integra- the derivational families, German opaque morpho- tion in word processing. Journal of Memory and Language, 56:457–471. logical priming may simply result from the high 1 Introduction rare), and the inflection is mostly realized by chang- frequency of opaque derivations. In the future, we David E. Meyer and Roger W. Schvaneveldt. 1971. Fa- ing the suffix and root vowel alteration, we can ex- Unknown named entity recognition in inflected lan- plan to apply the model to Dutch and French to cilitation in recognizing pairs of words: Evidence of pect the overall distance between lemma and its a dependence between retrieval operations. Journal check this alternative explanation. guages faces several specific problems – the first word forms to be not only bounded from above, but of Experimental Psychology, 90(2):227–234. 
and foremost is that the entities themselves are in- also have a regular distribution (roughly speaking, flected1 (Dvoncˇ et al., 1966) leading to a problem Acknowledgments Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Cor- the less typical the suffix length, the less likely is rado, and Jeff Dean. 2013. Distributed representa- of identifying word forms as belonging to the same such a word form to appear). We gratefully acknowledge funding by Deutsche tions of words and phrases and their compositional- lexeme, and also the problem of finding correct Forschungsgemeinschaft through Sonderfor- ity. In Advances in Neural Information Processing We used the morphological database of Slovak lemma. In this article we analyse the distribution schungsbereich 732, project B9. Systems 26, pages 3111–3119. ˇ of word forms for proper nouns in Slovak and de- language (Garabík and Šimková, 2012; Karcová, Sebastian Pado,´ Jan Snajder,ˇ and Britta Zeller. 2013. scribe an algorithm for their automatic extraction 2008; Garabík, 2007), which contains (at the time of writing) complete morphological information Derivational smoothing for syntactic distributional and lemmatisation. References semantics. In Proceedings of the 51st Annual Meet- of 35 009 nouns (lemmata), out of which 1031 are The task of lemmatisation and morphological an- J. Kathryn Bock. 1986. Syntactic persistence in lan- ing of the Association for Computational Linguistics, proper nouns. We randomly divided the database notation of flective (and more specifically, Slavic) guage production. Cognitive Psychology, 18:355– pages 731–735, Sofia, Bulgaria. into two parts, the training set and the evaluation languages is reasonably researched and developed 387. set, ensuring that about 90% of both common and Herbert Schriefers, Pienie Zwitserlood, and Ardi (Hajic,ˇ 2004). Since we cannot expect a morpho- Roelofs. 1991. The identification of morphologi- proper nouns is present in the training set. 
Béatrice Daille, Cécile Fabre, and Pascale Sébillot. 2002. Applications of computational morphology. In Paul Boucher, editor, Many morphologies, pages 210–234.

Katrin Erk. 2012. Vector space models of word meaning and phrase meaning: A survey. Language and Linguistics Compass, 6(10):635–653.

Gertrud Faaß and Kerstin Eckart. 2013. SdeWaC – a corpus of parsable sentences from the web. In Iryna Gurevych, Chris Biemann, and Torsten Zesch, editors, Language Processing and Knowledge in the Web, volume 8105 of Lecture Notes in Computer Science, pages 61–68. Springer Berlin Heidelberg.

Zellig Harris. 1968. Mathematical Structures of Language. Wiley.

Steve T. Kempley and John Morton. 1982. The effects of priming with regularly and irregularly related words in auditory word recognition. British Journal of Psychology, pages 441–445.

Will Lowe and Scott McDonald. 2000. The direct route: Mediated priming in semantic space. In Proceedings of the 22nd Annual Conference of the Cognitive Science Society, pages 675–680, Philadelphia, PA.

Marco Marelli and Marco Baroni. 2014. Dissecting semantic transparency effects in derived word processing: A new perspective from distributional semantics. In 9th International Conference on the Mental

cally complex spoken words: Continuous processing or decomposition? Journal of Memory and Language, 30:26–47.

Eva Smolka, Katrin H. Preller, and Carsten Eulitz. 2014. ‘verstehen’ (‘understand’) primes ‘stehen’ (‘stand’): Morphological structure overrides semantic compositionality in the lexical representation of German complex verbs. Journal of Memory and Language, 72:16–36.

Peter D. Turney and Patrick Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37(1):141–188.

Britta Zeller, Jan Šnajder, and Sebastian Padó. 2013. DErivBase: Inducing and evaluating a derivational morphology resource for German. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pages 1201–1211, Sofia, Bulgaria.

logical database (data relating lemmata to inflected word forms and their grammatical tags) to cover all or almost all the words present in the corpus (especially proper names that keep appearing depending on who or what has become a hot topic in mass media), using a well tuned guesser can improve the accuracy of lemmatisation and tagging.

Common sense says that named entities (proper names in particular) behave differently from common names, which translated into information theory terms means that the information about whether a word is a proper name is not independent from the information about its morphology paradigm. This means we can use the information about proper names to decrease the entropy of inflections, which is good because it helps the guesser choose between the possible lemmata and morphological tags.

2 Datasets

We denote Levenshtein distance (Левенштейн, 1965) between two words l and w by ρ(l, w). Since a typical Slovak noun has up to 12 different word forms (two numbers, six cases – the vocative is

The evaluation set contained 101 lemmata and 694 unique word forms for proper nouns.

[Figure 1 plot: two panels, common nouns and proper nouns; x-axis ρ(lemma, word) from 0 to 7, y-axis n/|N| from 0 to 1]

Figure 1: Distribution of word forms according to their Levenshtein distance from lemma.

1 e.g. for the lemma Galileo, genitive would be Galilea, dative Galileovi etc.

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org


Grouping morphologically complex words in the mental lexicon: Evidence from Russian verbs and nouns

Natalia Slioussar
HSE Moscow & St. Petersburg State University
[email protected]

Anastasia Chuprina
St. Petersburg State University
[email protected]

1 Introduction

Frequency is known to play a crucial role in lexical access. The notions primarily discussed in the literature are form frequency, (whole) word frequency and morpheme frequency, e.g. root frequency. In numerous studies (Alegre & Gordon, 1999; Baayen et al., 2007, a.m.o.), these characteristics were manipulated to find out whether various word forms are decomposed during lexical access or are stored and can be accessed as a whole. Similar issues arise when we turn from inflection to derivation, at least with semantically transparent derivates (Niswander-Klement & Pollatsek, 2006; Taft, 2004, a.m.o.).

2 Our study

Some morphologically complex words were shown to be accessed as a whole (then their own frequency played a crucial role), the others were demonstrated to be decomposed (then root frequency and the frequency of the word they are derived from were important). Both options are available in some models: the one that is more efficient in a particular case wins. However, the picture may be more complex in morphologically rich languages. If a word has many inflectional forms or derivates that are stored as a whole, they probably form groups, and lexical access to this word may depend on the properties of such groups. Our hypothesis is that if a word has a large group of morphologically complex derivates which are relatively semantically transparent, access to and storage of this word would depend on the properties of this group even though the derivates do not necessarily undergo the process of decomposition. We explored this question in our study on Russian.

2.1 Experiment 1

Method. We conducted a lexical decision experiment using E-Prime software. Participants were 27 speakers of Russian (age: 19-52 years, 20 female). Materials were 18 triplets of unprefixed imperfective verbs and 12 pairs of unprefixed deverbal nouns. Word frequency, length and CV structure were matched inside triplets and pairs, while the summed frequency of the corresponding prefixed verbs and nouns was different for every verb and noun inside a triplet/pair (as shown in Table 1). Word frequency information was taken from The Frequency Dictionary of the Modern Russian Language (Lyashevskaya & Sharoff, 2009).

word                       letters (in Cyrillic)   word F (ipm)   summed F of prefixed words   group
torčat’ ‘to stick out’     7                       86.3           2.0                          1
dyšat’ ‘to breathe’        6                       90.8           29.4                         2
platit’ ‘to pay’           7                       89.0           86.3                         3
roždenie ‘birth’           8                       98.5           35.8                         1
javlenie ‘apparition’      7                       94.3           297.5                        2

Table 1. An example of stimuli for Exp.1.

It is important to note that prefixed verbs are derived from unprefixed ones, while prefixed deverbal nouns are not (they are derived from prefixed verbs). For verbs, we also counted derivates with the reflexive postfix -sja. We made a simplification not taking suffixes into account because, firstly, suffixes change the inflectional class the word belongs to and often cause stress shifts and various alternations, and, secondly, most unprefixed verbs have dramatically more derivates created by prefixation than by suffixation.

In total, every participant saw 54 verbs in infinitive and 24 nouns in nominative singular form, and 78 nonce stimuli. They were shown on the computer screen for 500 ms or until a response button was pressed. If no button was pressed, participants saw a blank screen for up to 2 s. After a response was given or after these 2.5 s were over, an interstimulus interval was initiated and then the next trial began.

Results and discussion. We analyzed participants’ question-answering accuracy and reaction times. All participants gave at least 85% of correct answers (92.4% on average); trials with incorrect answers were excluded from further analysis. We also discarded all RTs that exceeded 1.5 s, as is customary in many such studies (e.g. Alegre & Gordon, 1999). In total, 0.3% of reactions to real stimuli were discarded.

We demonstrated that RTs for verbs differ significantly depending on the summed frequency of corresponding prefixed (and postfixed) verbs (repeated measures ANOVA, F1(2,52) = 8.66, p = 0.001, F2(2,34) = 4.99, p = 0.013), but RTs for nouns do not. Average RTs for different groups of verbs and nouns are given in Tables 2a and 2b.

group   av. F (ipm)   av. summed F of prefixed words   av. RT (ms)
1       40.1          11.0                             643.4
2       41.1          43.5                             632.3
3       41.1          139.0                            607.6

Table 2a. Average RTs for verb stimuli in Exp.1.

group   av. F (ipm)   av. summed F of prefixed words   av. RT (ms)
1       33.1          60.3                             637.8
2       31.9          220.6                            635.4

Table 2b. Average RTs for noun stimuli in Exp.1.

We believe that these results can be explained as follows. The majority of Russian prefixed verbs and nouns are likely to be stored as a whole because even relatively transparent ones tend to have some aspects of meaning that cannot be predicted compositionally. Still, prefixed verbs have close connections with their unprefixed counterpart in the mental lexicon due to direct derivational links and therefore influence lexical access to it. Prefixed deverbal nouns are not connected to their unprefixed counterpart in a similar way due to the lack of derivational links, so the summed frequency of such nouns does not influence lexical access to it.

However, an alternative explanation can also be suggested: prefixed verbs are decomposed (and thus boost the frequency of their unprefixed counterpart), while the results for nouns are inconclusive. We chose deverbal nouns for our experiment to find enough relatively transparent prefixed and unprefixed ones, and, if prefixed ones are decomposed, the system should go to the prefixed verb by stripping the suffix rather than to the unprefixed noun by stripping the prefix (rodit’(v) → porodit’(v) → poroždenie(n)). To refute this alternative explanation, we designed a follow-up experiment.

2.2 Experiment 2

Method. The method was the same as in Experiment 1. Participants were 24 speakers of Russian (age: 18-55 years, 18 female). Materials included 60 prefixed verb and noun stimuli and 60 nonce words. Real words were chosen from the pool of prefixed verbs and nouns whose unprefixed counterparts were analyzed in Experiment 1. This time both verbs and nouns were grouped in pairs. They were matched in length, CV structure and the frequency of their unprefixed counterparts, but differed in whole word frequency. An example is given in Table 3.

word                              letters (in Cyrillic)   word F (ipm)   unprefixed word F   group
podyšat’ ‘to breathe a little’    8                       7.7            90.8                1
otplatit’ ‘to pay back’           9                       1.7            89.0                2
poroždenie ‘production’           10                      5.1            98.5                1
projavlenie ‘manifestation’       10                      45.3           94.3                2

Table 3. An example of stimuli for Exp.2.

Moreover, we took care of the following. If verbs like podyšat’ ‘to breathe a little’ and otplatit’ ‘to pay back’ from Table 3 are accessed as a whole, their word frequency should matter, and podyšat’ (group 1) will be accessed faster. Now let us assume that they are decomposed, and so are many other prefixed verbs. Then not the word frequency of dyšat’ ‘to breathe’ and platit’ ‘to pay’ will predict the speed of the lexical access, but the frequency of these unprefixed verbs plus the summed frequency of their decomposed derivates. As Table 1 shows, this value is greater for platit’ than for dyšat’, so otplatit’ (group 2) will be accessed faster. This was true for all other prefixed verb pairs in Experiment 2, so the whole word access and decomposition scenarios always gave different predictions.

We could not find prefixed noun pairs with a similar distribution of frequencies in our materials. However, no approach would predict that they could be decomposed by stripping off their prefix first anyway. So noun stimuli were included mainly to make experimental materials more diverse; they will not let us tease apart different lexical access scenarios.

Results and discussion. We analyzed participants’ question-answering accuracy and reaction times. All participants gave at least 85% of correct answers (92.0% on average); trials with incorrect answers were excluded from further analysis. We also discarded all RTs that exceeded 1.5 s. In total, 0.4% of reactions to real stimuli were discarded.

We demonstrated that this time, RTs for verbs and nouns differed depending on their whole word frequencies. The difference was statistically significant both for prefixed verbs (RM ANOVA, F1(1,23) = 17.87, p < 0.001, F2(1,17) = 5.98, p = 0.026) and for prefixed nouns (F1(1,23) = 21.27, p < 0.001, F2(1,11) = 7.88, p = 0.017). Average RTs for different groups are shown in Tables 4a and 4b.

group   av. F   corresp. unpref. verb from Exp.1   av. RT
1       16.3    low summed F                       707.3
2       2.0     high summed F                      746.0

Table 4a. Average RTs for verb stimuli in Exp.2.

group   av. F   corresp. unpref. noun from Exp.1   av. RT
1       12.4    low summed F                       688.4
2       76.4    high summed F                      657.5

Table 4b. Average RTs for noun stimuli in Exp.2.

The results are indicative of whole word lexical access. We can conclude that prefixed verbs influence lexical access to their unprefixed counterpart not through decomposition, but because they are closely connected in the mental lexicon due to direct derivational links.

3 Conclusion

Using Russian prefixed and unprefixed verbs, we demonstrated that a group of semantically transparent derivates influences the recognition of the word they are derived from. The higher the summed frequency of derivates, the faster the lexical access. One could argue that this is due to decomposition. We showed that this is not the case.

In two lexical decision experiments we conducted, reaction times to prefixed verbs and deverbal nouns depended on their own frequencies, which points to whole word storage. At the same time, reaction times to unprefixed verbs were influenced by the summed frequency of their derivates (created by prefixation and postfixation). We conclude that this effect is explained not by decomposition of the derivates during lexical access, but by their strong connection to the word they are derived from.

Our conclusion is confirmed by the data from deverbal nouns. On the surface (i.e. phonologically), the overlap between unprefixed and prefixed verbs on the one hand and unprefixed and prefixed nouns on the other hand is the same: as examples from Tables 1 and 3 show, they coincide once the prefix is stripped. If this factor played a role, the results for unprefixed verbs and nouns would be the same.

However, reaction times to unprefixed nouns are not influenced by the summed frequency of their prefixed counterparts. This proves that connections through derivational links matter. Prefixed deverbal nouns are derived from prefixed verbs, not from unprefixed nouns (porodit’(v) ‘to give birth, to generate’ → poroždenie(n) ‘production’, not roždenie(n) ‘birth’ → poroždenie(n) ‘production’). Phonologically, prefixed nouns resemble unprefixed ones much more than prefixed verbs, but this does not play a role.

In total, our results can be taken as a piece of evidence for a new type of frequency information to be taken into account. Somewhat similar conclusions were reached by Moscoso del Prado Martín et al. (2004) who studied morphological family size effects in Finnish compared to Dutch and Hebrew.

Of course, many things remain to be explored. As we noted earlier, we did not look at suffixation. We did not specify the mechanisms by which derivationally related forms are connected in the mental lexicon and how these connections are formed. In the connectionist approach where no decomposition is assumed, regular connections between words’ phonological forms and meanings should matter. In dual route models, it can be suggested that decomposition normally does not win in some cases like derived verbs and nouns we analyzed, but still takes place. Then only the existence of a direct derivational link and, probably, semantic transparency should really matter.

To solve these and other problems, many crucial questions need to be answered. Which derivates ‘boost’ the frequency of a base word and to what extent? What is the role of semantic transparency and phonological similarity between a derivate and its base form? How important is it for their connection whether they belong to one part of speech or to one inflectional class? Would stress shifts and alternations influence our results? We hope to address some of these questions in our further research.

References

Alegre, M., & Gordon, P. 1999. Frequency effects and the representational status of regular inflections. Journal of Memory and Language, 40, 41-61.

Baayen, H., Wurm, L.H., & Aycock, J. 2007. Lexical dynamics for low-frequency complex words: A regression study across tasks and modalities. The Mental Lexicon, 2, 419-463.

Lyashevskaya, O.N., & Sharoff, S.A. 2009. Chastotnyj Slovar’ Sovremennogo Russkogo Jazyka (‘The Frequency Dictionary of Modern Russian Language’). Moscow: Azbukovnik.

Moscoso del Prado Martín, F., Bertram, R., Häikiö, T., Schreuder, R., & Baayen, R.H. 2004. Morphological family size in a morphologically rich language: The case of Finnish compared to Dutch and Hebrew. Journal of Experimental Psychology: Learning, Memory and Cognition, 30, 1271–1278.

Niswander-Klement, E., & Pollatsek, A. 2006. The effects of root frequency, word frequency, and length on the processing of prefixed English words during reading. Memory and Cognition, 34, 685-702.

Taft, M. 2004. Morphological decomposition and the reverse base frequency effect. Quarterly Journal of Experimental Psychology, 57A, 745-765.

What can distributional semantic models tell us about part-of relations?

François Morlane-Hondère
LIMSI-CNRS, Orsay, France
[email protected]

1 Introduction

The term Distributional semantic models (DSMs) refers to a family of unsupervised corpus-based approaches to semantic similarity computation. These models rely on the distributional hypothesis (Harris, 1954), which states that semantically related words tend to share many of their contexts. So, by collecting information about the contexts in which words are used in a corpus, DSMs are able to measure the distributional similarity of two words, which theoretically translates into a semantic one.

In recent years, these models have become very popular in a wide range of NLP tasks (Weeds, 2003; Baroni and Lenci, 2010), mainly because of the ever-increasing availability of textual data. Regardless of their use in NLP applications, distributional data provide precious information about words’ behaviour and their tendency to appear in the same contexts. Yet, linguists have shown little interest in DSMs (Sahlgren, 2008). We believe that this kind of information can be relied on to empirically assess the validity of linguistic theories. Conversely, by shedding light on underlying linguistic factors that influence distributional behaviours, linguistic studies can contribute to improving our understanding of the results provided by DSMs.

This paper illustrates such a qualitative linguistic approach by investigating the presence of part-of relations among distributionally similar French words. We compare distributional data and a set of part-of relations provided by humans in a lexical network. In order to assess the nature of the part-of word pairs which can – or cannot – be found in DSMs, these words were sense-tagged using WordNet supersenses. Our results show considerable discrepancies between the representation of part-of sense pairs in distributional data.

2 Part-of relation and DSMs

As its name suggests, part-of relation – or meronymy1 – holds between a part – the meronym – and its whole – the holonym –, like in bed/pillow, armor/steel or ostrich/feather. It is one of the central relations used in knowledge representation.

Automatic extraction of part-of relations has been addressed using many approaches, most of which are pattern-based (Berland and Charniak, 1999; Girju et al., 2006; Pantel and Pennacchiotti, 2006). However, the unsupervised nature of the distributional approach makes it an attractive alternative.

Studies were conducted to assess the nature of the semantic relations extracted by distributional models – using human judges (Kuroda et al., 2010), thesauri (Morlane-Hondère, 2013; Ferret, 2015) or ad hoc datasets (Baroni and Lenci, 2011). They showed that part-of relations are present in varying proportions among distributionally similar words. This very presence is interesting in that unlike synonymy, hypernymy or co-hyponymy, meronymy is not a similarity relation (Resnik, 1993; Budanitsky and Hirst, 2006): an ostrich is not the same kind of thing as a feather, nor is an armor the same kind of thing as steel. Following the distributional hypothesis, it is not expected that these kinds of meronyms share a lot of contexts.

It appears, though, that a certain proportion of them tend to do so. For example, in Baroni and Lenci (2010)’s DSM, player, pianist and musician are among the ten most distributionally similar words of orchestra. In the remainder of this study, we compare the semantic properties of the meronyms which can be extracted using a distributional approach and the properties of the meronyms which cannot.

1 Some authors make a distinction between part-of relation and meronymy (Cruse and Croft, 2004).

3 Methodology and data

3.1 The part-of dataset

The first step consists in gathering a set of meronyms. Although efforts are made to provide expert-built lexical semantic resources for French (Fišer and Sagot, 2008; Pradet et al., 2014), there is currently no freely-available equivalent – in terms of quality and coverage – to WordNet (Fellbaum, 1998) or the Moby thesaurus (Ward, 2002) for French. So, we use the JeuxDeMots (JDM) lexical network (Lafourcade, 2007), which is a GWAP (Game With A Purpose) in which players are asked to provide words which can be in a given relation with a given word2.

Although collaboratively-built lexical semantic resources have been shown to be valuable (Gurevych and Wolf, 2010) and although a relation in JDM must be provided by two different players to be added to the network, a certain proportion of part-of relations in JDM are actually hypernyms (sucette/bonbon ‘lollipop/candy’), synonyms (chef/patron ‘chief/boss’) or thematic associations (océanographie/eau ‘oceanography/water’). Two possible explanations for these confusions are the lack of linguistic expertise of the players or a misunderstanding of the instruction. Erroneous relations were manually removed from the set.

One interesting characteristic of JDM part-of relations is that a considerable number of them do not fit into traditional typologies of meronymy relations. For example, topological inclusions (cell/prisoner), attachment relations (ear/earring) or ownership (millionaire/money) are very common among JDM part-of pairs although they are considered to be non-meronymic relations (Winston et al., 1987).

After filtering the pairs whose members do not appear in our DSM and removing most of the erroneous relations, there were 24 089 part-of pairs left in our dataset.

2 http://www.jeuxdemots.org/

3.2 Sense tagging

In a previous study (Morlane-Hondère and Fabre, 2012), we manually annotated the different meronymic sub-relations – following Winston and Chaffin (1987)’s typology – in a dataset like the one described above. The idea was to test whether there is a correlation between the nature of the relation between two words and their probability of being extracted in a DSM. However, the typology has proven to be inadequate, so we chose to annotate the words instead of their relation. This is also what we do in this study. This approach is inspired by the idea that the difference between the meronymic sub-relations is due to the semantic nature of the words involved (Murphy, 2003).

The above-mentioned lack of freely-available thesauri for French led us to use WordNet to perform this task. Words of our dataset were 1) translated to English, 2) mapped to WordNet synsets and 3) linked to their translation’s supersense(s). Supersenses – or lexicographer classes – are a set of 44 coarse semantic categories used to classify WordNet’s noun, verb and adjective entries3. Examples of the 25 noun supersenses are GROUP, LOCATION or FOOD. Supersenses were then manually disambiguated (drawer can both belong to the PERSON and ARTIFACT supersenses, but only the latter fits in the pair cabinet/drawer).

3 http://wordnet.princeton.edu/man/lexnames.5WN.html

3.3 The distributional model

We use a DSM4 generated from the frWaC corpus (Baroni et al., 2009) – a 1.6-billion-word corpus of French web pages.

Words in the DSM appear at least 20 times in the corpus and in at least 5 different contexts. Syntactic dependencies were used as contexts using the Talismane parser (Urieli, 2013). Relations taken into account in the context vectors are the subject, object and modifier relations. Prepositions and coordinating conjunctions are also included as relations (the label of the relation being the preposition or the coordinating conjunction).

The weighting of the contexts was made using the pointwise mutual information and the cosine measure was used to compute the similarity between the context vectors. The minimum similarity threshold has been set to 0.02. The total number of word pairs in the DSM is 3 674 254.

4 Provided by Franck Sajous from the CLLE-ERSS laboratory.

4 Results and discussion

We then measure the proportion of semantically-annotated part-of pairs – sense pairs – in our set which are present in the DSM. Sense pairs which occur less than 100 times in the dataset are discarded. Table 1 provides the list of the 22 remaining sense pairs and, for each one, the ratio of part-of pairs present in the DSM. In this section, we describe the homogeneous sense pairs – whose semantic classes are identical – and the heterogeneous ones, then we provide a detailed analysis of some of the PERSON/BODY meronyms which have been extracted by the DSM.

holonym/meronym     %       holonym/meronym      %
TIME/TIME           84      ARTIFACT/PERSON      32.6
LOC./LOC.           78.3    ARTIFACT/ARTIFACT    31.4
SUBST./SUBST.       62.4    ARTIFACT/LOC.        24.8
OBJECT/OBJECT       61      ARTIFACT/PLANT       22.8
COMM./COMM.         53.8    ARTIFACT/SUBST.      20.4
GROUP/PERSON        52.8    OBJECT/ANIMAL        19.8
LOC./ARTIFACT       46.8    PLANT/PLANT          19.7
BODY/BODY           40.5    GROUP/ANIMAL         17.1
ANIMAL/ANIMAL       41      PERSON/ARTIFACT      16.5
ARTIFACT/COMM.      39.9    ANIMAL/BODY          9.4
ACT/ARTIFACT        35.8    PERSON/BODY          5.5

Table 1: Part-of sense pairs and their presence in the DSM.

4.1 Homogeneous sense pairs

As expected, part-of relations composed of two words of the same class are the most represented in the DSM. 84 % of the TIME/TIME part-of pairs were extracted by the DSM. This can be explained by the fact that the members of pairs like mois/jour ‘month/day’ both appear in contexts involving temporal prepositions like venir ILYA ‘to come SINCE’, se dérouler DURANT ‘to take place DURING’ or scrutin AVANT ‘election BEFORE’.

Likewise, the spatial dimension plays a crucial role in the extraction of meronyms (78.3 % of LOCATION/LOCATION pairs are extracted). This is due to the fact that, as for time, spatial information can be conveyed by specific prepositions. Thus, LOCATION/LOCATION meronyms’ shared contexts massively involve the DANS ‘IN’ relation.

SUBSTANCE pairs are the third best-extracted kind of pairs. The reason why 37.6 % of them have not been extracted can be illustrated by the comparison of acier ‘steel’ and two of its meronyms, namely fer ‘iron’ – which was extracted in the DSM – and carbone ‘carbon’ – which was not extracted:

1. acier and fer both appear in contexts like grille EN ‘grille COMP’, forgé MOD ‘forged MOD’ or lame DE ‘blade COMP’. Thus, they appear as materials and, moreover, as materials which are used to build the same kind of things;

2. although being a material as well, carbone does not appear as such in the corpus. Rather, its contexts are chemical compounds like monoxyde DE ‘monoxide COMP’. It is also modified by adjectives like inorganique MOD ‘inorganic MOD’, which describe chemical properties of carbone. These two kinds of contexts are not found among acier’s.

So, we can see that there is a discrepancy between the contexts in which acier appears in the corpus and the ones in which carbone appears: whereas acier – as well as fer – is used as a material, the representation of carbone that emerges from the corpus is that of a chemical element.

4.2 Heterogeneous sense pairs

At the other end of the scale, part-of relations composed of two words of different classes are – also logically – the least represented in the DSM.

Part-of pairs composed of words that refer to human beings or to animals and their body parts are barely present in the DSM (although being the most frequent sense pairs in our dataset). In frWaC, PERSON words appear as subjects of action (prendre ‘to take’, dire ‘to say’) or cognitive verbs (vouloir ‘to want’, savoir ‘to know’). They are frequently modified by nationality adjectives. Body parts do not appear in such contexts. The class of body parts was actually found to be quite heterogeneous, in that body parts’ distributions in the corpus differ from persons’, but not in the same way:

• organ nouns mostly appear in noun compounds to indicate the location of medical interventions (radiographie DE ‘x-ray MOD’) or affections (cancer DE ‘cancer COMP’ or lésion DE ‘injury COMP’);

• limb nouns are modified by adjectives related to location and are objects of verbs like lever ‘to raise’ or étendre ‘to stretch’.

All these contexts are obviously incompatible with PERSON words.

A similar distributional discrepancy can be observed with the ANIMAL/BODY sense pair, except that animal nouns tend to appear in contexts like élevage DE ‘farming COMP’ or espèce DE ‘species COMP’. They are also modified by size adjectives. It is interesting to note that many animal body parts like tête DE ‘head COMP’, peau DE ‘skin COMP’ or queue DE ‘tail COMP’ do appear among the closest contexts of animal nouns. This means that the meronymic relation between nouns referring to animals and their body parts is not a paradigmatic one. Thus, it is reasonable to say that, in order to extract this particular relation, the use of syntagmatic patterns would be a better strategy than the use of a paradigmatic DSM.

The sense pair GROUP/PERSON also presents an interesting situation. Of all the heterogeneous sense pairs, meronymic relations belonging to this one are the most likely to be extracted by the distributional method. This can be explained by a tendency to use the GROUP entities in a metonymic way: although an army is not the same kind of thing as a soldier, both words share contexts like tirer SUJ ‘to shoot SUBJ’ or tué PAR ‘killed BY’. Another reason is the transitivity of properties like nationality: armée ‘army’ and soldat ‘soldier’ are both modified by nationality adjectives because usually, members of the armed forces of a nation have to be citizens of this nation.

In Section 2, we mentioned the fact that three meronyms of orchestra were present among its ten most distributionally similar words in Baroni and Lenci (2010)’s DSM. In our data, the meronyms orchestre/musicien have also been extracted: as for army and soldier, these words share semantic features. They are related to the kind of music a musician and an orchestra can play (classique MOD ‘classical MOD’, traditionnel MOD ‘traditional MOD’ or jazz DE ‘jazz MOD’), the kind of actions they perform (interprété PAR ‘performed BY’, accompagné PAR ‘accompanied BY’) or their nationality.

4.3 Focus on the PERSON/BODY sense pair

In the previous subsection, we saw that meronyms belonging to the PERSON/BODY sense pair are the least likely to be extracted with the distributional approach. In this subsection, we provide further insight into this result by examining the nature of the few PERSON/BODY meronymic pairs that were successfully extracted.

The examination of the 5.5 % of PERSON/BODY meronymic pairs that were successfully extracted is disappointing: the vast majority of the contexts shared by the meronym and the holonym are quite random. For example, the meronyms homme/main ‘man/hand’ share contexts like nu MOD ‘bare MOD’ or dos DE ‘back COMP’, which are not very informative about their relation. On the other hand (!) some shared contexts like doigt DE ‘finger COMP’ and saisir SUJ ‘to grab SUBJ’ are more informative. The fact that these specific features are shared by the meronyms indicates some kind of similarity between them: when a man grabs a rock, it is actually his hand that completes the action of grabbing, just as a man’s fingers are also his hand’s fingers.

The meronyms enfant/oeil ‘child/eye’ also share some interesting contexts: both the meronym and the holonym are subjects of verbs of visual perception like regarder ‘to look’, percevoir ‘to perceive’ or observer ‘to observe’. The metonymic interpretation is quite straightforward: although the eye is the child’s part that allows him to look/perceive/observe, this ability is extended to the whole child.

This phenomenon partially explains why such meronyms share semantic – thus distributional – features and are more likely to be extracted with a DSM.

5 Conclusion

The main goal of this study is to shed light on the linguistic phenomena at work in DSMs. By comparing a set of sense-tagged part-of relations and a distributional model, we show that the semantic class of the meronyms has a dramatic influence on their probability to be extracted by a DSM. We also highlight the – positive – influence of metonymy in the extraction of heterogeneous meronyms.

These results show that the part-of relation is not a monolithic entity but a collection of different kinds of relations between different kinds of words which may or may not be distributionally similar.

Acknowledgments

This work was partially supported by the ANSM (French National Agency for Medicines and Health Products Safety) through the Vigi4MED project under grant #2013S060.
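As an illustration of the pipeline described in Sections 3.3 and 4 of the part-of study (PMI weighting of contexts, cosine similarity with a 0.02 threshold, then the share of part-of pairs found per sense pair), here is a minimal sketch. It is not the model used in the paper. The co-occurrence counts are invented toy values, all function names are hypothetical, contexts never observed with a word are simply treated as absent, and the frequency filters on words (20 occurrences, 5 contexts) are not modelled.

```python
import math
from collections import Counter


def pmi_vectors(cooc):
    """Weight raw word/context counts with pointwise mutual information."""
    total = sum(sum(ctxs.values()) for ctxs in cooc.values())
    word_totals = {w: sum(ctxs.values()) for w, ctxs in cooc.items()}
    ctx_totals = Counter()
    for ctxs in cooc.values():
        ctx_totals.update(ctxs)
    return {
        w: {c: math.log2(n * total / (word_totals[w] * ctx_totals[c]))
            for c, n in ctxs.items()}
        for w, ctxs in cooc.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v[k] for k, x in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similar_pairs(vectors, threshold=0.02):
    """Keep unordered word pairs whose similarity reaches the threshold."""
    words = sorted(vectors)
    pairs = {}
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                pairs[frozenset((a, b))] = sim
    return pairs


def coverage(part_of, dsm_pairs, min_count=100):
    """Percentage of part-of pairs found in the DSM, per sense pair."""
    totals, found = Counter(), Counter()
    for holonym, meronym, sense_pair in part_of:
        totals[sense_pair] += 1
        if frozenset((holonym, meronym)) in dsm_pairs:
            found[sense_pair] += 1
    return {sp: 100.0 * found[sp] / n
            for sp, n in totals.items() if n >= min_count}


# Toy counts (invented) using context labels quoted in Section 4.1.
cooc = {
    "acier": {"grille EN": 4, "lame DE": 6},
    "fer": {"grille EN": 5, "lame DE": 3, "forgé MOD": 2},
    "carbone": {"monoxyde DE": 7, "inorganique MOD": 3},
}
dsm = similar_pairs(pmi_vectors(cooc))
part_of = [("acier", "fer", "SUBSTANCE/SUBSTANCE"),
           ("acier", "carbone", "SUBSTANCE/SUBSTANCE")]
print(coverage(part_of, dsm, min_count=1))
```

With these toy counts only acier/fer share contexts, so one of the two SUBSTANCE/SUBSTANCE pairs is recovered. The `min_count` argument mirrors the rule that sense pairs occurring fewer than 100 times are discarded; it is lowered to 1 here only because the toy set is tiny.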

Language proficiency moderates morphological priming in children and adults

Jana Hasenäcker
Max-Planck-Institute for Human Development, Berlin
hasenaecker@mpib-berlin.mpg.de

Elisabeth Beyersmann
Laboratoire de Psychologie Cognitive, Aix-Marseille Université and Centre de la Recherche Scientifique, Marseille
lisi.beyersmann@gmail.com

Sascha Schroeder
Max-Planck-Institute for Human Development, Berlin
sascha.schroeder@mpib-berlin.mpg.de

1 Introduction

A number of studies have shown that skilled readers decompose morphologically complex words upon encountering them (for a review, see Rastle & Davis, 2008). It has been proposed that this segmentation process is early and automatic and is driven by orthographic form, while being blind to semantic content; it is therefore also called morpho-orthographic (Rastle, Davis, & New, 2004; Taft, 2003). One key finding in favor of this proposition comes from masked morphological priming: the recognition of a target word is facilitated when it is preceded by a morphologically related word prime (teacher-TEACH). Facilitation has also been found in a number of languages for targets preceded by pseudocomplex word primes, that is, words that appear to have a morphologically complex structure but are simplex words (corner-CORN). Moreover, facilitation has also been observed from complex pseudoword primes, that is, non-existing combinations of a stem and an affix (flexify-FLEX). For non-morphological nonword primes, that is, non-existing combinations of a word and a non-morphemic ending (flexint-FLEX), mixed results have been obtained (Longtin & Meunier, 2005; Morris, Porter, Grainger & Holcomb, 2011). Recent evidence from French points to a moderating role of language proficiency: the extent to which morpho-orthographic information is used increases as a function of individual vocabulary and spelling skills in adults (Andrews & Lo, 2013; Beyersmann, Casalis, Ziegler & Grainger, 2014).

Only a few studies have been concerned with morphological decomposition in beginning readers. The few studies from English and French used complex word primes, pseudocomplex word primes and non-morphological word primes. Quémart, Casalis and Colé (2011) found priming in French grade 3, 5, and 7 children from complex as well as pseudocomplex words, suggesting that children already use adult-like decomposition processes. In contrast, Beyersmann, Castles and Coltheart (2012) only found priming from truly complex words in grade 3 and 5 English-speaking children, indicating that morpho-orthographic priming is not yet automatized and that decomposition relies more on semantics in developing readers. However, no studies with children have used complex pseudoword primes so far, although they make it possible to use the paradigm in languages that do not naturally have pseudocomplex words, such as German.

Morphological decomposition is insightful to investigate in German because of the language's specific characteristics. German has a transparent orthography and is morphologically rich. As a consequence, morphological entities might be a very useful unit for effective word recognition, even for beginning readers. Nevertheless, for children who are still in the process of reading acquisition and who show more variability in their lexical representations, language proficiency can be expected to play an even greater role than Beyersmann et al. (2014) found for adults.

The aim of the present study was therefore to test whether the moderating effect of language proficiency, as indexed by vocabulary and spelling skills, on morphological priming can be replicated with German adults and whether it generalizes to readers at the lower end of the proficiency range, namely children. We expect to see evidence for a more automatized form of morpho-orthographic decomposition in highly proficient children (replicating the Quémart et al. pattern), whereas low-skilled children should show less priming (as in Beyersmann et al., 2012) or no robust priming at all. In our adult group, we expect robust priming in all three prime conditions (including the nonsuffixed condition) in high proficiency participants, but reduced nonsuffixed priming in low proficiency participants.

2 Method

2.1 Participants

Twenty-four university students (13 women, mean age = 25.2 years, age range: 20–29 years) and 24 elementary school children (13 girls, mean age = 9.5 years, age range: 8;6–10;9 years, grade 3-5) participated in the experiment.

Each participant’s language proficiency was assessed using a spelling and a vocabulary test. Adults performed a spelling recognition test, which was modelled after the one used by Andrews and Lo (2013). Participants were asked to classify 100 words as correctly or incorrectly spelled. Children performed the fill-in-the-gap dictation test of the SLRT-II (Moll & Landerl, 2010). For assessment of vocabulary, adults completed the German version of the LexTALE (Lemhöfer & Broersma, 2012), and children the vocabulary subtest of the CFT 20 (Weiß, 1998). A composite measure of spelling and vocabulary was calculated by standardizing and averaging the spelling and vocabulary scores for each participant.

2.2 Materials

We conducted a masked priming lexical decision experiment using real suffixed words (kleidchen-KLEID), suffixed pseudowords (kleidtum-KLEID), nonsuffixed pseudowords (kleidekt-KLEID) and unrelated controls (träumerei-KLEID) as primes. 50 word targets were selected from the childLex corpus (Schroeder, Würzner, Heister, Geyken, & Kliegl, 2014) and 50 pseudoword targets were created by changing one letter of a real word that was not in the target word set. Word and nonword targets were matched on length. For each target all four types of primes were created. Primes were matched on length, suffix length and non-morphemic ending length across conditions. Four counterbalanced lists of prime-target combinations were created, each containing a target only once, such that participants saw each target with only one of the four prime conditions.

2.3 Procedure

Stimuli were presented in white 20-point Courier New font in the center of a black screen on a 15″ laptop monitor with a refresh rate of 60 Hz. Each trial consisted of a 500-ms fixation cross, followed by a 500-ms forward mask of hash keys, then a prime in lowercase for 50 ms, followed by the target in uppercase. The target remained on screen until response. Participants were instructed to indicate whether the presented stimulus was an existing German word or not by pressing a key as quickly and as accurately as possible. They were not informed about the presentation of the prime.

2.4 Results

Reaction times were analyzed using linear mixed-effects modeling. Participants and items were included as random factors, and lexical status of the target (word, pseudoword), prime type (suffixed word, suffixed pseudoword, nonsuffixed pseudoword, unrelated word), age group (adults, children) and language proficiency (a continuous composite of the spelling and vocabulary scores), as well as all their interactions, were included as fixed effects. Where appropriate, one-sided post-hoc contrasts were applied comparing all related priming conditions with the unrelated condition. For contrasting readers with higher and lower proficiency, reaction times of participants scoring one standard deviation above or below the mean proficiency measure within their age group were used. Significance was evaluated using the normal distribution. Results are reported for word targets only. Descriptive statistics are provided in Table 1.

For adults, priming was observed from all three related conditions (suffixed word, suffixed pseudoword and nonsuffixed pseudoword) relative to the unrelated condition, z=5.04, z=4.43, z=2.07, all p<.05. However, language proficiency moderated priming effects. Priming in the nonsuffixed pseudoword condition was only significant for adults with higher language proficiency (+1SD), z=1.74, p<.05, but not for adults with lower language proficiency (-1SD), z=1.16, p=.25.

For children, proficiency played an even more pronounced role than for adults: higher proficiency children (+1SD) showed the same pattern as higher proficiency adults, namely priming from all related conditions, z=3.03, z=2.02, z=2.96, all p<.05. In contrast, in lower proficiency children (-1SD) priming did not reach significance in any condition, although there was a numerical advantage from suffixed word primes in the mean reaction times (40 ms faster compared to the unrelated condition).

Table 1. Response times (in ms) for children and adults, averaged across items for each participant. Standard errors are presented in parentheses.

Prime Type             Adults RT    Children RT   Stimulus Example

All participants
Suffixed Word          593 (12)     1024 (36)     kleidchen - KLEID
Suffixed Nonword       597 (12)     1051 (38)     kleidtum - KLEID
Nonsuffixed Nonword    614 (13)     1045 (38)     kleidekt - KLEID
Unrelated              629 (14)     1087 (41)     träumerei - KLEID

Higher Language Proficiency (+1SD)
Suffixed Word          588 (12)      900 (28)     kleidchen - KLEID
Suffixed Nonword       583 (12)      924 (30)     kleidtum - KLEID
Nonsuffixed Nonword    602 (12)      903 (28)     kleidekt - KLEID
Unrelated              620 (13)      974 (33)     träumerei - KLEID

Lower Language Proficiency (-1SD)
Suffixed Word          599 (12)     1189 (48)     kleidchen - KLEID
Suffixed Nonword       613 (13)     1218 (51)     kleidtum - KLEID
Nonsuffixed Nonword    626 (14)     1239 (52)     kleidekt - KLEID
Unrelated              638 (14)     1229 (51)     träumerei - KLEID

3 Conclusion

Our results confirm recent evidence for French adults (Beyersmann et al., 2014), showing that the extent to which morphological information is exploited depends on language proficiency in German as well. Adults in the present study showed morphological priming effects from suffixed word primes (kleidchen-KLEID), suffixed pseudoword primes (kleidtum-KLEID) and also nonsuffixed pseudoword primes (kleidekt-KLEID) relative to unrelated words (träumerei-KLEID). Priming from the nonsuffixed pseudoword condition did not remain significant with decreasing language proficiency. Moreover, the pattern of priming generalizes to beginning readers with higher language proficiency: they show priming similar to that of higher-proficiency adults. For children with lower language proficiency, the effects did not reach significance, but were clearly most pronounced in the suffixed word condition.

We argue that there is a developmental gradient in the use of morphological information during reading acquisition, driven by language proficiency. Beginning readers with low language proficiency are only able to benefit from morpho-semantic information, if at all. More advanced lexical knowledge allows readers to extract morpho-orthographic information. Following Andrews and Davis (1999) and Grainger and Ziegler (2011), we assume that this happens through segmentation of the affix in lower proficiency adults, as indicated by the priming effects of both suffixed prime conditions. Crucially, higher proficiency adult and even child readers with sophisticated lexical knowledge are able to additionally use segmentation of the embedded stem, and therefore show facilitation in the nonsuffixed prime condition as well. Our results highlight the importance of lexical knowledge as a further determinant of the ability to exploit morphological structure in the word recognition process, especially in children.

References

Andrews, S., & Davis, M. H. (1999). Interactive activation accounts of morphological decomposition: Finding the trap in mousetrap? Brain and Language, 68, 355-361.

Andrews, S., & Lo, S. (2013). Is morphological priming stronger for transparent than opaque words? It depends on individual differences in spelling and vocabulary. Journal of Memory and Language, 68, 279-296.

Beyersmann, E., Casalis, S., Ziegler, J. C., & Grainger, J. (2014). Language proficiency and morpho-orthographic segmentation. Psychonomic Bulletin & Review.

Beyersmann, E., Castles, A., & Coltheart, M. (2012). Morphological processing during visual word recognition in developing readers: Evidence from masked priming. The Quarterly Journal of Experimental Psychology, 65, 1306-1326.

Grainger, J., & Ziegler, J. C. (2011). A dual-route approach to orthographic processing. Frontiers in Psychology, 45.

Lemhöfer, K., & Broersma, M. (2012). Introducing LexTALE: A quick and valid Lexical Test for Advanced Learners of English. Behavior Research Methods, 44, 325-343.

Longtin, C. M., & Meunier, F. (2005). Morphological decomposition in early visual word processing. Journal of Memory and Language, 53, 26-41.

Moll, K., & Landerl, K. (2010). SLRT-II: Lese- und Rechtschreibtest: Weiterentwicklung der Salzburger Lese- und Rechtschreibtests (SLRT): Manual. Bern: Verlag Hans Huber.

Morris, J., Porter, J. H., Grainger, J., & Holcomb, P. J. (2011). Effects of lexical status and morphological complexity in masked priming: An ERP study. Language & Cognitive Processes, 26, 558-599.

Quémart, P., Casalis, S., & Colé, P. (2011). The role of form and meaning in the processing of written morphology: A priming study in French developing readers. Journal of Experimental Child Psychology, 109, 478-496.

Rastle, K., & Davis, M. H. (2008). Morphological decomposition based on the analysis of orthography. Language & Cognitive Processes, 23, 942-971.

Rastle, K., Davis, M. H., & New, B. (2004). The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition. Psychonomic Bulletin & Review, 11, 1090-1098.

Schroeder, S., Würzner, K.-M., Heister, J., Geyken, A., & Kliegl, R. (2014). childLex: A lexical database for German read by children. Manuscript submitted for publication. Max Planck Institute for Human Development, Berlin.

Taft, M. (2003). Morphological representation as a correlation between form and meaning. In E. Assink & D. Sandra (Eds.), Reading complex words (pp. 113-137). Amsterdam, The Netherlands: Kluwer.

Weiß, R. H. (1998). Grundintelligenztest Skala 2 (CFT 20) mit Wortschatztest (WS) und Zahlenfolgentest (ZF). Göttingen: Hogrefe.


Modeling Lexical Effects in Language Production: Where Have We Gone Wrong?

Ting Zhao
Department of Education, University of Oxford
[email protected]

Victoria A. Murphy
Department of Education, University of Oxford
[email protected]

1 Introduction

Words have their own conceptual representations, semantic properties, and physical forms. These lexical characteristics not only set words apart as distinct items in the lexical repertoire but also provide valuable insight into the processes and mechanisms of language production.

Over the past decades there has been a large body of research examining how word meaning, form, and usage directly affect the speed of monolingual speakers’ production (e.g. Alario et al., 2004; Barry, Morrison, & Ellis, 1997; Bates et al., 2003; Bonin, Chalard, Méot, & Fayol, 2002). Of note, almost all these studies have failed to accommodate the fact that word usage, given that it is a behavioral outcome (Zevin & Seidenberg, 2002, 2004), likely mediates the relationship between meaning/form and spoken production. Moreover, lexical characteristics have been predominantly examined as discrete variables in the literature, but in fact, some of them may correspond to the same layer of language production or the same aspect of lexical knowledge. Additionally, little work has been done on children’s emerging bilingual lexical representations, especially those of children learning an L2 within input-limited contexts, possibly because this population has only recently begun to receive focused attention in the research field.

In order to delineate the exact manner in which lexical effects come into play, the present study used structural equation modeling to perform a simultaneous test of the complex relationships among a variety of lexical variables and to assess their direct, indirect and total effects on L2 lexical processing efficiency. Furthermore, three types of hypothesized models, in which the lexical relationships were specified differently with respect to spoken production in L2 learners, were estimated and compared.

2 Lexical characteristics that contribute to the speed of spoken production

This study considers three lexical layers (i.e. Meaning, Form, and Usage), each of which is underpinned by its own manifest indicators. The lexical variables under examination have all been found to significantly influence the speed of lexical processing, as will be briefly reviewed below.

Meaning.
(1) Word concreteness (WC): A main difference between concrete and abstract words lies in the sensorimotor attributes of the former. A number of studies have revealed that concrete words exhibit preferential processing relative to abstract words (e.g. De Groot, 1992; Jin, 1990; Schwanenflugel & Akin, 1994).
(2) Word typicality (WT): The degree of a lexical item’s typicality depends upon how many attributes it shares with other members of the same category. Typical items are usually processed more accurately and faster than atypical items in a range of tasks (e.g. Bjorklund & Thompson, 1983; Jerger & Damian, 2005; Southgate & Meints, 2000).
(3) Semantic neighborhood density (SND): Words with high SND are characterized by having many semantic neighbors and low semantic distance, whereas low-SND words typically have few semantic neighbors and high semantic distance. The superiority of high-SND over low-SND words for processing has been observed in lexical decision, word naming, and semantic categorization (e.g. Buchanan, Westbury, & Burgess, 2001; Siakaluk, Buchanan, & Westbury, 2003; Yates, Locker, & Simpson, 2003).
(4) Number of related senses (NoRS): Many words are polysemous in that they have several different but related senses. Compared to monosemous words, polysemous words exhibit preferential processing in a variety of tasks (e.g. Beretta, Fiorentino, & Poeppel, 2005; Klepousniotou & Baum, 2007; Lichacz, Herdman, Lefevre, & Baird, 1999).

Form. Word length can be measured orthographically (i.e. NoL: number of letters) or phonologically (i.e. NoP: number of phonemes and NoS: number of syllables). The presence of length effects has been reported in several previous studies (e.g. Alario et al., 2004; Cuetos, Ellis, & Alvarez, 1999; De Groot, Borgwaldt, Bos, & Van den Eijnden, 2002), although the predictive power of each specific measure varies across research contexts, possibly due to their examination of different languages (Bates et al., 2003).

Usage. Usage is represented by subjective word frequency (SWF) and/or age of acquisition (AoA), both of which have been observed to significantly affect the speed of spoken production in such a way that individuals take less effort to access high-frequency and early-acquired words relative to low-frequency and late-acquired ones (e.g. Balota, Cortese, Sergent-Marshall, Spieler, & Yap, 2004; Barry et al., 1997; Morrison, Ellis, & Quinlan, 1992). AoA effects interact with frequency effects in such a way that the former is partly dependent on the latter (Brysbaert & Ghyselinck, 2006).

3 Methodology and analytical strategies

3.1 Methodology

Participants. Thirty-nine 5th grade children (aged 10-11 years) and 94 undergraduates (aged 17-20 years) were recruited. All had Chinese as their native language and English as their second. The child sample had been learning English as a foreign language for about 2.5 years, and the adult sample for approximately 10 years.

Stimuli. The experiment consisted of two blocks of stimulus words and one block of filler words. Each block had 35 (child group) or 66 (adult group) valid trials. The stimuli were selected from ten semantic categories in almost equal numbers. They were all presented in the same format over the course of the experiments.

Procedures. The participants were tested individually in a quiet room. They performed picture naming in L2 (English) and then L1 (Chinese)-to-L2 (English) translation. As a stimulus appeared on the screen, the participants were asked to produce the L2 word as rapidly and accurately as possible. The SuperLab software (Cedrus Corporation, 2007) generated stimulus presentations. Response latencies (RLs), defined as the duration between the presentation of a stimulus and the initiation of a vocal response, were recorded using the Audacity software, and then manually calculated for analysis.

Norms of lexical variables. The values of WC, WT, and SWF were rated by the participants on Likert scales. The values of other lexical variables were obtained from psycholinguistic databases such as the Irvine Phonotactic Online Dictionary (Vaden, Halpin, & Hickok, 2009) and Wordmine2 (Durda & Buchanan, 2006).

3.2 Analytical strategies

Structural equation modeling (SEM), which combines path analysis, confirmatory factor analysis, and analysis of structural models, was used to estimate the goodness-of-fit of three types of hypothesized models. This analytical strategy, as an extension of multiple regression, enables researchers to estimate not only the direct effects but also the indirect effects that one variable has upon another. Moreover, SEM can be used to measure the proportion of variance explained by the models proposed in the present study so as to hold general implications for the lexical processing system as a whole, although it should be acknowledged that this type of analysis might lack a specific focus on certain variables through purposeful manipulation of experimental materials. Additionally, latent variables are formed to manifest different dimensions that are underpinned by their own indicators. In so doing, the present study moves away from the examination of each lexical variable to that of specified constructs and structural relations between constructs, so that a better understanding of the nature of lexical characteristics can be gained at a more macro level.

Conducting SEM typically involves six steps (Kline, 2011): model specification, model identification, measure selection, model estimation, model evaluation and modification, and interpreting and reporting results. Moreover, as recommended by Kline (2011), SEM was conducted in two steps in the present study, that is, the measurement models were validated in terms of convergent validity, discriminant validity, and reliability before the structural models were estimated. One last thing to note is that the data entered for analysis were lexical items. The stimulus size in the adult group was considered sufficiently large for performing SEM analysis. In order to reduce the complexity of the hypothesized model specifying children’s L2 lexical processing, composite variables rather than latent variables were constructed to decrease the number of stimulus words required for this type of analysis.

Three competing models were hypothesized and estimated to determine which one best fitted the data. The first model concerns only the direct relationship between the lexical variables and picture naming and translation latencies. The second model identifies word usage as a mediator and examines the indirect effects of meaning and form variables on the recorded RLs. The third model considers both direct and indirect effects of word meaning and form on the outcome variable. To illustrate, an example of these three types of hypothesized models that specify the possible relationships between lexical variables and the speed of adults’ picture naming is presented in Appendix A.

The goodness of model fit was estimated according to six types of indices, including model χ², CFI, RMSEA, AGFI, GFI, and NFI. A rule of thumb is that an RMSEA below .08 indicates reasonable fit, and values greater than .90 for the CFI, AGFI, GFI, and NFI suggest close approximate fit. SEM was run using IBM SPSS AMOS v.20.

It should be noted that, before performing SEM analysis, the whole RL data set was screened for incorrect and omitted responses, outliers (low cut-off: below 350ms, high cut-off: 3 SDs), and those participants and stimulus items with an exceptionally high error rate. As conventionally done, RLs were then averaged to generate a summary score for each lexical item, and these values were entered into the final SEM analysis.

4 Results

The model-fit indices of the three models under examination across two types of productive tasks in both populations are presented in Appendix B. Comparatively, the child and adult data could best be modeled by the third model, in which word meaning and form make not only a direct but also an indirect contribution to the RLs.

Taking picture naming in adults as an example (see SEM results in Appendix B and Figure 1), it is clear that Model 3 achieved a much better model fit than Model 1, and Model 3 explained more variance in naming latencies (59%) than Model 1 (45%) and Model 2 (51%). Additionally, among all the lexical variables included in Model 3, only word usage was found to make a significant and direct contribution to the naming latencies. Similar results held for adults’ L1-to-L2 translation (see Appendix C for details).

Figure 1: SEM results: Picture naming in adults (panels: Model 1, Model 2, Model 3)

As regards children’s picture naming, the results presented in Appendix B show that Model 3 reached a better model fit than Models 1 and 2. Moreover, Figure 2 indicates that Model 3 (38%) explained more variance in naming speed than Model 1 (36%) and Model 2 (24%). In addition, word usage, as represented by age of acquisition, together with word typicality, was found to significantly and directly predict the naming speed in Model 3. Similar results were observed with children’s L1-to-L2 translation (see Appendix C for details).
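The distinction between direct, indirect, and total effects that separates the three hypothesized models can be made concrete with a small sketch. It assumes the simplest case of one predictor (a meaning or form variable), one mediator (word usage) and one outcome (response latency), and uses the standard product-of-coefficients rule from path analysis. The class name `MediationPaths` and all coefficient values are hypothetical illustrations; they are not estimates reported in the study and not output of AMOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediationPaths:
    """Standardized paths for one predictor, one mediator, one outcome.

    Hypothetical mapping to the text: predictor = a meaning or form variable,
    mediator = word usage, outcome = response latency (RL).
    """

    a: float        # predictor -> mediator (e.g. meaning -> usage)
    b: float        # mediator -> outcome (usage -> RL)
    c_prime: float  # predictor -> outcome, bypassing the mediator

    def direct_effect(self) -> float:
        return self.c_prime

    def indirect_effect(self) -> float:
        # Product-of-coefficients rule for a single mediated path.
        return self.a * self.b

    def total_effect(self) -> float:
        return self.direct_effect() + self.indirect_effect()


# Illustrative coefficients only; none of these are estimates from the study.
model_1 = MediationPaths(a=0.0, b=0.0, c_prime=-0.30)     # direct path only
model_2 = MediationPaths(a=-0.40, b=0.50, c_prime=0.0)    # mediated path only
model_3 = MediationPaths(a=-0.40, b=0.50, c_prime=-0.10)  # both kinds of path

for name, m in [("Model 1", model_1), ("Model 2", model_2), ("Model 3", model_3)]:
    print(
        f"{name}: direct={m.direct_effect():+.2f}, "
        f"indirect={m.indirect_effect():+.2f}, total={m.total_effect():+.2f}"
    )
```

In this toy setting, a model restricted to direct paths attributes the whole effect to the predictor itself, whereas a model with both kinds of path splits the same total into a direct part and a part carried through usage. A full SEM with latent variables and several mediated paths would sum one such product per path.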

Figure 2: Path analysis results: Picture naming in children (panels: Model 1, Model 2, Model 3)

Taken together, these results indicate that word usage does not exist independently of other lexical variables but rather mediates the impact of meaning and form on L2 children’s and adults’ productive performance. In comparison, the indirect effects of meaning and form on L2 lexical processing efficiency were found to be more noticeable in adults than in children.

5 Discussion and conclusions

The present study uses SEM as a methodological improvement to investigate the relationships between a range of lexical variables and L2 lexical processing efficiency in both children and adults. A comparison of the three different types of models indicates that word meaning and form make not only a direct but also an indirect contribution to the speed of L2 lexical processing, and that word usage likely mediates the extent to which meaning and form influence the processing outcome. Furthermore, a comparison between children and adults suggests that the importance of word usage tends to increase with age.

A note of caution should thus be raised when interpreting the results of previous studies where the mediating effects of word usage have not been adequately addressed. Accordingly, future research modeling lexical effects would be well advised to consider the indirect effect that word meaning and form have on L2 learners’ productive performance via usage.

Although this study provides new insights into how lexical variables are related to each other, there are several limitations that should be acknowledged. First, since this research focuses only on L2 learners within input-limited contexts, whether or not the same results still hold for other L2 learner types, particularly those whose L1s are not Sino-Tibetan languages, as well as for monolingual speakers, needs to be further investigated. Importantly, examining these issues would allow us to gain a better understanding of the nature of lexical characteristics by addressing the issue of whether lexical effects are language-dependent or universal across languages. Second, not all the variance can be explained by the included lexical variables, partly because it seems implausible to cover every possible feature of a lexical item, for both theoretical and practical reasons. Third, given the use of a non-experimental design, it would be difficult to draw unequivocal conclusions about causality among the variables of interest.

To conclude, the model that considers both direct and indirect effects of meaning and form on L2 lexical processing efficiency may be superior to those that do not. As also observed in our study, word usage does play a mediating role in lexical processing, in part reflecting that ‘only in the stream of thought and life do words have meanings’ (Wittgenstein, 1967, p.31).

References

Alario, F.-X., Ferrand, L., Laganaro, M., New, B., Frauenfelder, U. H., & Segui, J. (2004). Predictors of picture naming speed. Behavior Research Methods, 36(1): 140-155.

Balota, D. A., Cortese, M. J., Sergent-Marshall, S. D., Spieler, D. H., & Yap, M. J. (2004). Visual word recognition of single-syllable words. Journal of Experimental Psychology: General, 133: 283-316.

Barry, C., Morrison, C. M., & Ellis, A. W. (1997). Naming the Snodgrass and Vanderwart pictures: Effects of age of acquisition, frequency, and name agreement. The Quarterly Journal of Experimental Psychology, 50(3): 560-585.

Bates, E., D’Amico, S., Jacobsen, T., Székely, A., Andonova, E., Devescovi, A., . . . Pléh, C. (2003). Timed picture naming in seven languages. Psychonomic Bulletin & Review, 10(2): 344-380.

Beretta, A., Fiorentino, R., & Poeppel, D. (2005). The effects of homonymy and polysemy on lexical access: an MEG study. Cognitive Brain Research, 24(1): 57-65.

Bjorklund, D. F., & Thompson, B. E. (1983). Category typicality effects in children’s memory performance: Qualitative and quantitative differences in the processing of category information. Journal of Experimental Child Psychology, 35(2): 329-344.

Bonin, P., Chalard, M., Méot, A., & Fayol, M. (2002). The determinants of spoken and written picture naming latencies. British Journal of Psychology, 93(1): 89-114.

Brysbaert, M., & Ghyselinck, M.

Kline, R. B. (2011). Principles and Practice of Structural Equation Modeling (3rd ed.). New York: Guilford Press.

Lichacz, F. M., Herdman, C. M., Lefevre, J. A., & Baird, B. (1999). Polysemy effects in word naming. Canadian Journal of Experimental Psychology, 53(2): 189-193.

Morrison, C. M., Ellis, A. W., & Quinlan, P. T. (1992). Age of acquisition, not word frequency, affects object naming, not object recognition. Memory & Cognition, 20(6):


nificant priming effects obtained with the Low frequency M-primes/High frequency targets, suggests competition effects to the detriment of an obligatory decomposition process. According to this view both low and high frequency targets should have benefited from the prior presentation of a morphologically related word, but the results revealed this was not the case. Only base targets having a surface frequency lower than the surface frequency of their derivation were significantly facilitated relative to both the orthographic and the unrelated conditions (+45 and +36ms).

Giraudo, H. & Grainger, J. (2000). Effects of Prime word frequency in masked morphological and orthographic priming. Language and Cognitive Processes, 15, 421-444.

Giraudo, H., & Grainger, J. (2001). Priming complex words: Evidence for supralexical representation of morphology. Psychonomic Bulletin & Review, 8(1), 127-131.

Giraudo, H. & Voga, M. (2007). Lexeme-Based Model vs. Morpheme-Based Model from Psycholinguistic Perspectives. In F. Montermini, G. Boyé, and N. Hathout (Eds.), Selected Proceedings of the

segmentation in visual word recognition. Psychonomic Bulletin and Review 11 (6), 1090-1098.

Rastle, K. & Davis, M.H. (2008). Morphological decomposition based on the analysis of orthography. Language and Cognitive Processes, 23, 942-971.

Seidenberg, M.S. (1987). Sublexical structures in visual word recognition: Access units or orthographic redundancy? In M. Coltheart (Ed.), Attention and performance XII: Reading (pp. 245-263). Hillsdale, NJ: Erlbaum.

Voga, M. & Grainger, J. (2004). Masked Morphological Priming with Varying Levels of Form Overlap: Evidence from Greek Verbs. Current Psychology Letters: Behaviour, Brain & Cognition, 13(2).
(2006). 705-714. We interpret these data as an evidence of a com- 5th Décembrettes: Morphology in Toulouse, pp. The effect of age of acquisition: Partly Schwanenflugel, P.J., & Akin, Carolyn E. (1994). petition process among the word forms at the 108-114. Somerville, MA: Cascadilla Proceedings frequency related, partly frequency Developmental trends in lexical decisions for word level. Project. independent. Visual Cognition, 13(7-8): 992- abstract and concrete words. Reading Giraudo, H. & Voga, M. (2014). Measuring morphol- 1011. Research Quarterly, 29(3): 251-264. References ogy: the tip of the iceberg? A retrospective on 10 Buchanan, L., Westbury, C., & Burgess, C. Siakaluk, P.D., Buchanan, L., & Westbury, C. years of morphological processing, Carnets de (2001). Characterizing semantic space: (2003). The effect of semantic distance in Amenta, S. & Crepaldi, D. (2012). Morphological Grammaire, 22, xxx-xxx. processing as we know it: An analytical review of neighborhood effects in word recognition. yes/no and go/no-go semantic categorization morphological effects in visual word identification. McCormick, S. F., Rastle, K. & Davis, M. H. (2009). Psychonomic Bulletin & Review, 8 (3): 531- tasks. Memory & Cognition, 31(1): 100-113. Frontiers in Language Sciences, 3, 232. Adore-able not adorable? Orthographic underspeci- 544. Southgate, V., & Meints, K. (2000). Typicality, fication studied with masked repetition priming. Cedrus Corporation. (2007). SuperLab 4.5. San naming, and category membership in young Aronoff, M. (1994). Morphology by itself. Cam- European Journal of Cognitive Psychology, 21, Pedro, CA. children. Cognitive Linguistics, 11(1/2): 5-16. bridge: MIT Press. 813-836. Cuetos, Fernando, Ellis, A.W., & Alvarez, B. Vaden, KI, Halpin, HR, & Hickok, GS. (2009). Booij, G. (2010). Construction Morphology. Oxford: McCormick, S. F., Brysbaert, M., & Rastle, K. (1999). 
Naming times for the Snodgrass and Irvine Phonotactic Online Dictionary, Version Oxford University Press. (2009). Is morphological decomposition limited to Vanderwart pictures in Spanish. Behavior 2.0. Retrieved January 30, 2013 from Boudelaa, S., & Marslen-Wilson, W. D. (2001). Mor- low-frequency words?. The Quarterly Journal of Research Methods, 31(4): 650-658. http://www.iphod.com/search/V2ListWords.html. phological units in the Arabic mental lexicon. Experimental Psychology, 62(9), 1706-1715. De Groot, A.M.B. (1992). Determinants of word Yates, Mark, Locker, Lawrence, & Simpson, Cognition , 81, 65-92. Marantz, A. (2013) No escape from morphemes in translation. Journal of Experimental Greg B. (2003). Semantic and phonological Duñabeitia, J.A., Perea, M., & Carreiras, M. (2007). morphological processing, Language and Cogni- Psychology: Learning, Memory, and influences on the processing of words and Do transposed-letter similarity effects occur at a tive Processes, 28:7, 905-916 Cognition, 18(5): 1001-1018. pseudohomophones. Memory & Cognition, 31(6): 856-866. morpheme level? Evidence for morpho- Marslen-Wilson, W.D., Ford, M., Older, L., & Zhou, De Groot, A.M.B., Borgwaldt, Susanne, Bos, orthographic decomposition. Cognition, 105, 691- X. (1996). The combinatorial lexicon: Priming der- Mieke, & Van den Eijnden, Ellen. (2002). Zevin, J.D., & Seidenberg, M.S. (2002). Age of 703. ivational affixes. In G. Cottrell (Ed.), Proceedings Lexical decision and word naming in acquisition effects in word reading and other Duñabeitia, J.A., Laka, I., Perea, M., & Carreiras, M. of the 18th Annual Conference of the Cognitive bilinguals: Language effects and task effects. tasks. Journal of Memory and Language, (2009). Is Milkman a superhero like Batman? Con- Science Society, (pp. 223-227). Mahwah, NJ: Law- Journal of Memory and Language, 47(1): 91- 47(1): 1-29. stituent morphological priming in compound rence Erlbaum Associates. 124. Zevin, J.D., & Seidenberg, M.S. (2004). 
Age-of- words. The European Journal of Cognitive Psy- New B., Pallier C., Ferrand L., Matos R. (2001). Une Durda, K., & Buchanan, L. . (2006). acquisition effects in reading aloud: Tests of chology, 21(4), 615 – 640. base de données lexicales du français contemporain WordMine2. Retrieved December 1, 2012. cumulative frequency and frequency Drews, E. & Zwitserlood, P. (1995). Morphological sur internet: LEXIQUE, L'Année Psychologique, from http://www.wordmine2.org trajectory. Memory & Cognition, 32(1): 31-38. and orthographic similarity in visual word recogni- 101, 447-462. Jerger, S., & Damian, M.F. (2005). What’s in a name? Typicality and relatedness effects in tion. Journal of Experimental Psychology: Human Rastle, K., Davis, M.H., Marslen-Wilson, W.D., & Perception & Performance, 21, 1098-1116. Tyler, L.K. (2000). Morphological and semantic children. Journal of Experimental Child Psychology, 92(1): 46-75. Forster, K.I., & Davis, C. (1984). Repetition priming effects in visual word recognition: A time-course and frequency attenuation in lexical access. Jour- study. Language and Cognitive Processes 15 (4-5), Jin, Y.S. (1990). Effects of concreteness on nal of Experimental Psychology: Learning, 507-537. cross-language priming in lexical decisions. Perceptual and Motor Skills, 70(3): 1139- Memory, and Cognition, 10, 680-698. Rastle, K., & Davis, M.H. (2003). Reading morpho- logically complex words: some thoughts from 1154. Frost, R., Deutsch, A., & Forster, K. I. (2000). De- Klepousniotou, E., & Baum, S.R. (2007). composing morphologically complex words in a masked priming. In S. Kinoshita & S.J. Lupker nonlinear morphology. Journal of Experimental (Eds.), Masked priming: State of the art (pp. 279- Disambiguating the ambiguity advantage Psychology: Learning, Memory and Cognition, 26, 305). New York: Psychology Press. effect in word recognition: An advantage for polysemous but not homonymous words. 751-765. Rastle, K., Davis, M.H., & New B. (2004). 
The broth in my brother’s brothel: Morpho-orthographic Journal of Neurolinguistics, 20(1): 1-24.

130 55 Appendix A. An example of the hypothesized models different timing for their stimulus onset asyn- general average and outliers were removed chrony (SOA). The SOAs of the original experi- (1.4%). Adult picture naming ments were of 57ms in Giraudo and Grainger (2000) and 42ms in McCormick, Rastle, & Davis Table 1: mean reaction times across the three (2009). priming conditions and the two targets condi- tions. Net priming effects are expressed in ms. 2 The study In order to disentangle such findings, the present RTs Net priming effects study was carried out using the same paradigm (U-M/O-M and U-O) and similar SOAs as the previous ones described. HighSF M 613 +45* / + 36* The main manipulation was to compare morpho- primes- primes logical facilitation when the frequency of the LowSF O 649 +9 complex words (used as primes) and their roots targets primes (used as targets) was inverted. More specifically U 658 we selected 60 base word targets from French, primes half having systematically a surface frequency LowSF M 572 +22 / +18 Model 1 Model 2 Model 3 higher than their derived forms (55.82 primes- primes occ./million) and the other half a surface fre- HighSF O 590 +4 quency lower than their derived forms (10.15 targets primes Appendix B. Fit indices for the hypothesized models occ./million according to Lexique database, New, U 594 Pallier, & Ferrand, 2001). Each target was primes primed by (1) a morphologically related word * : p < .05 �! 
(�) df CFI RMSEA AGFI GFI NFI (M, e.g., mariage-marier ‘wedding – to marry’), Picture Model 1 173.83 (.00) 27 .81 .17 .74 .87 .79 (2) an orthographically related word (O, e.g., The results show a clear pattern of a morphologi- naming Model 2 52.16 (.00) 27 .97 .07 .90 .95 .94 marine-marier, ‘navy-marry’) and (3) an unrelat- cal facilitation effect (reaction times decreases Adults Model 3 45.96 (.00) 22 .97 .08 .90 .96 .94 ed word (U, e.g., courage-marier, ‘courage- when the prime-target relationship is morpholog- Chinese- Model 1 169.46 (.00) 27 .81 .16 .75 .88 .79 marry’). In both the HighSF condition and the ical, compared to orthographic and unrelated English Model 2 45.69 (.01) 27 .98 .06 .91 .96 .94 LowSF condition, primes were matched in num- control conditions). translation Model 3 41.87 (.01) 22 .97 .07 .90 .96 .95 ber of letters (respectively 6.4 and 7 letters in A significant difference across conditions can be Picture Model 1 28.17 (.00) 11 .85 .27 .82 .93 .79 average) and surface frequency (respectively observed only when the target word has a lower naming Model 2 28.52 (.00) 11 .88 .12 .83 .94 .79 6.48 and 40.64 occ./million in average). Primes frequency than the primes. Statistical analysis Chil- Model 3 5.67 (.46) 6 1.00 .00 .93 .99 .96 were presented according to two frame durations showed that the critical net priming effects (dif- dren Chinese- Model 1 28.17 (.00) 11 .81 .12 .82 .93 .74 (SOAs), 48 and 66ms to examine the time-course ference between the reaction times for morpho- English Model 2 23.09 (.02) 11 .86 .10 .87 .95 .79 of priming. Three experimental lists were con- translation Model 3 5.67 (.46) 6 1.00 .00 .93 .99 .95 logical primes against orthographic and unrelated structed using a Latin square in order to present control ones) for HighSF primes - LowSF targets each target once only. was of 45* and 36*ms (respectively). Twenty-five students at the University of Tou- When looking at the LowSF-primes and HighFs Appendix C. 
SEM results of the hypothesized models louse participated in the experiment. All the par- targets the RTs differences of the net priming Adults: ticipants were native speakers of French and effects previously described, where not statisti- their average age was 26 (7.23 sd). They were all cally significant Morphological facilitation ef- Picture naming right handed and had normal to corrected-to- fects seem to be larger when the frequency of the normal vision. The experiment lasted around 40 prime is higher than the frequency of the target, minutes and in exchange for their time, partici- regardless of the SOA used. pants received a 4 Giga USB key.

The results are presented in Table 1. As we 2. Conclusion didn’t find any effect of the frame duration, we The results of the present study are in line with decided to present the averaged RTs. the previously found by Giraudo and Grainger Mistaken answers were not considered for the (2000), showing differential priming effects statistical analysis (2.8% of the data), neither when the surface frequency of the prime is ma- were reaction times lower than 250ms and over nipulated. The absence of a morphological prim- 1500ms (1% of the data). Cut-offs for the rest of ing effect in the High frequency M-primes/Low the data were set to 2.5 standard deviations from frequency targets contrasted with the strong sig- Model 1 Model 2 Model 3

56 129 Visual word recognition of morphologically complex words: Effects of L1-to-L2 translation prime word and root frequency

Hélène Giraudo Karla Orihuela [name] Laboratoire CLLE (Equipe ERSS) Laboratoire CLLE (Equipe ERSS) [address] CNRS & Université Toulouse Jean Jaurès CNRS & Université Toulouse Jean Jaurès [address] giraudo@univ -tlse2.fr [email protected] [address] [e-mail]

raudo & Voga, 2007; 2014; Giraudo & Grainger, Abstract 2000; 2001; but see also Aronoff, 1994 and Model 1 Model 2 Model 3 Booij, 2010 for the same linguistic view) or for The present study aims to investigate the the access ways to the lexicon, morphology in- relative role of the surface frequencies fluencing the simple development of orthograph- (i.e., token frequencies) in base word ic representations (e.g., Duñabeitia et al., 2007; recognition. A masked priming experi- Rastle & Davis, 2003; Rastle et al., 2004 and see ment was carried using two types of suf- Children: in the same vein Marantz, 2013). An interesting fixed French primes: the effects of words way to explore this issue is to use the masked Picture naming having a surface frequency (SF) higher priming paradigm (Forster & Davis, 1984) which than their base (e.g., cigarette – cigare) has been designed to measure the qualitative and were compared with those produced by the quantitative effects induced by the prior pro- word primes having a SF lower than their cessing of a morphologically complex word pre- base (e.g., froideur-froid ‘coldness- sented visually on the subsequent processing of cold’). Results show that HighSF are another -target- word. Behavioural data obtained more efficient primes than LowSF rela- with the masked priming paradigm associated tive t o both orthographic and unrelated with the lexical decision task revealed clear priming baselines. This suggests that de- strong morphological priming effects through spite a highly salient base, whole words various languages (Arabic: Boudelaa & Marslen- Model 1 Model 2 Model 3 matter more than morphemes during the Wilson, 2001; Basque: Duñabeitia, Laka, Perea, early processes of lexical access. 
& Carreiras, 2009; English: Rastle, Davis, 1 Introduction Marslen-Wilson & Tyler, 2000; French: Giraudo & Grainger, 2000; German and Dutch: Drews Morphological complexity has been extensively and Zwitserlood, 1995; Greek: Voga & Grainger, L1-to-L2 translation explored by psycholinguists in order to shed light 2004; Hebrew: Frost, Deutsch & Forster, 1997) on the role of morphology in lexical structuring. but the results are still controversial when ma- Starting from the idea - inherited from the con- nipulating the relative frequencies of the prime nectionist theory of visual word recognition (see and the target. On the one hand, some studies Seidenberg, 1987) - that the lexicon is comprised (Giraudo & Grainger, 2000) have revealed that of different levels of interconnected representa- larger effects are obtained when using high in tions reflecting the linguistics characteristics of comparison to low frequency derived primes en- the words as well as the cognitive processes by couraging the lexeme-based approach; on the the which complex words are recognized, the other, some authors (McCormick, Rastle, & Da- main issue regarding lexical morphology con- vis, 2009) have failed to observe an interaction cerns its specific role relative to word forms and between the prime frequency and morphological Model 1 Model 2 Model 3 semantics. Accordingly, morphology can be facilitation, strengthening the morpheme-based though as a structuring factor either for the lexi- approach. It has been suggested that these out- con, morphological relationships being expressed comes may be due to the fact that the methodo- by the mapping between from and meaning re- logical procedure among experiments varies flecting the construction of the words (e.g., Gi- (Amenta & Crepaldi, 2012), as they each use a

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org
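The data trimming and the net priming effects reported with Table 1 of the masked priming study can be reproduced with a few lines of arithmetic. The following is a minimal Python sketch, not the authors' analysis script: the function names `trim_rts` and `net_priming` are hypothetical, and applying the 2.5 SD cut-off to one pooled list with the sample standard deviation is an assumption, since the text does not say whether trimming was done per participant, per condition, or overall.

```python
from statistics import mean, stdev

# Mean RTs (ms) copied from Table 1, by frequency condition and prime type
# (M = morphological, O = orthographic, U = unrelated).
TABLE_1 = {
    "HighSF primes - LowSF targets": {"M": 613, "O": 649, "U": 658},
    "LowSF primes - HighSF targets": {"M": 572, "O": 590, "U": 594},
}


def net_priming(rts):
    """Net priming effects as differences between condition means (ms)."""
    return {
        "U-M": rts["U"] - rts["M"],  # morphological vs. unrelated baseline
        "O-M": rts["O"] - rts["M"],  # morphological vs. orthographic baseline
        "U-O": rts["U"] - rts["O"],  # orthographic vs. unrelated baseline
    }


def trim_rts(rts, low=250, high=1500, n_sd=2.5):
    """Drop RTs below `low` or above `high` ms, then drop RTs lying more
    than `n_sd` standard deviations from the mean of what remains.
    Assumption: one pooled list and the sample SD are used."""
    kept = [rt for rt in rts if low <= rt <= high]
    if len(kept) < 2:
        return kept
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]


effects = {cond: net_priming(rts) for cond, rts in TABLE_1.items()}
for cond, eff in effects.items():
    print(cond, eff)
# HighSF primes - LowSF targets {'U-M': 45, 'O-M': 36, 'U-O': 9}
# LowSF primes - HighSF targets {'U-M': 22, 'O-M': 18, 'U-O': 4}
```

The printed differences match the net priming column of Table 1; whether a given difference is significant is a separate question that this sketch does not address.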

2 Methods

To test our prediction on L1 transfer effects, we designed two tasks to probe different knowledge of L2 structures: one being implicit and meaning-focused; the other being explicit and form-focused.

2.1 Materials and Procedures

An Animation Matching Task (AMT) was used to probe L2ers' implicit knowledge because it called for a focus on meaning. The AMT included 12 items (6 test sentences and 6 fillers). The 6 test sentences included verbs underspecified for directionality of transfer. The 6 fillers bore only surface similarity and served to distract participants' focus in different ways: 2 contained syntactically unacceptable sentences; another 2 contained sentences that matched both animations; the other 2 contained sentences that matched neither of the two animations. See Appendix A.

On each trial, the L2ers first saw 2 animations on the computer screen. Next, they heard the target sentence presented auditorily. Participants were required to match the sentence to the correct animation. For example,

(6) Zhangsan reng-le Lisi yi jian waitao.
    Zhangsan toss-PERF Lisi one CL coat
    Lit: 'Zhangsan tossed Lisi one coat.'

The sentence was preceded by two animations: (a) Zhangsan tossed one coat to Lisi; (b) Zhangsan tossed one of Lisi's coats away. Participants chose which animation was a better match for the sentence by ticking the answer on the answer sheet. They were told at the beginning of the test that if they found both animations matching the sentence, they could select both. If they found neither matching the sentence or if they could not understand the sentence, they could choose the 'don't know' option on the side and choose/state the reason. See Appendix B.

[...] was provided on the side which learners could choose if they were unsure of the response. See Appendix D.

2.2 Participants

20 L2ers and 10 native speakers (NS) of Chinese serving as a control group participated in this study. All NS were graduate students born and raised in Taiwan. Most L2ers were undergraduate students, with the exception of 3 people who were Catholic priests. L2ers had learned Chinese in Taiwan for at least 3 years and came from different Spanish-speaking countries. Spanish was the native language for all L2ers. English was the second most proficient language.

Before the study, L2ers had completed a 40-item Chinese proficiency cloze test developed by Yuan (2014). Based on the scores, they were divided into an Advanced (AD) and an Intermediate (IN) group. Table 1 summarizes the participants' background information and cloze test scores.

    Group                                     NS           AD           IN
    Number of participants                    10           10           10
    Mean age (ranges in brackets)             26.2 (22-28) 26.9 (23-38) 24.1 (20-36)
    Duration (years) of formal instruction    NA           8.4          5.7
    Length (years) of residence in Taiwan     NA           5.7 (3-11)   4.8 (3-9)

    Group   Source   *Goal   *Both   Don't know
    NS      100      0       0       0
    AD      57       10      33      0
    IN      23       17      57      3

Table 2: Percentages of choice in the AMT

A 2-sample z-test was performed separately to compare proportions between any 2 among the 3 groups. The results showed that any 2 groups were significantly different from each other in the choice for Source and for Both, but not significantly different in Goal. The IN group, as expected, showed overgeneralization in wrongly choosing Both, while the AD group seemed to be able to overcome overgeneralization and limit the construction of Chinese AO to LA-source, given that the choice for Both was greatly decreased and that for Source was greatly increased at the higher proficiency level.

As for the AJT, Table 3 presents the mean scores of each group by verb type, with the standard deviation in brackets. Using an alpha level of 0.05, paired t-tests showed that only NS exhibited a significant difference in the responses to the 2 types of verbs, while the L2 groups did not.

    Group   Verb type
            Consumption    Creation
    NS      3.53 (0.39)    1.36 (0.24)
    AD      3.22 (0.54)    3.33 (0.44)
    IN      3 (0.34)       3.23 (0.38)

Table 3: Mean scores for the AJT

In contrast with the result in Table 2, the AD group did not perform better in the AJT than the IN group in rejecting ungrammatical AO-Goal introduced by verbs of creation. The question is how we can account for the AD group's inconsistency in overcoming overgeneralization.

Notice that the major difference between the 2 tasks is whether the verb specifies directionality of transfer. Verbs included in the AMT are the verbs that do not favor a particular direction of transfer, and therefore the introduced applied argument is inherently ambiguous between Goal and Source in the L1 Spanish. In other words, the verbs that trigger ambiguity in L1 Spanish are where subjects first overcome overgeneralization.

It is, therefore, proposed that subjects' experience in L1 of resorting to context in the face of ambiguity caused by verbs underspecified for directionality helps advanced L2ers overcome overgeneralization. The sensitivity trained in L1 is transferred to L2 learning and displayed in that more attention is paid to the co-occurring applied argument in the face of the ambiguous thematic role assigned to the applied argument. Advanced L2ers might have accumulated enough indirect statistical information (Reali and Christiansen, 2005) tracked from co-occurrences of recurring sequences of words before being able to overcome overgeneralization. This finding suggests that the effects of L1 transfer result not only from the similarity and/or difference of linguistic facts between the native and the target language, but also from L2ers' experience gained in their native language.

References

C. Cuervo. 2003. Datives at Large. PhD Thesis. MIT.
C. J. Huang. 2007. Hanyu dongci de tiyuan jiegou yu qi jufa biaoxian (The thematic structures of verbs in Chinese and their syntactic projection). Yuyan Kexue (Linguistic Sciences) 6(4): 3-21.
B. MacWhinney. 2004. A multiple process solution to the logical problem of language acquisition. Journal of Child Language, 31, 833-914.
L. Pylkkänen. 2000. What applicative heads apply to. In M. Minnick, A. Williams, and E. Kaiser (eds.), working papers in Proceedings of the 24th Annual Penn Linguistics Colloquium 7(1).
L. Pylkkänen. 2002. Introducing Arguments. PhD Thesis. MIT.
L. Pylkkänen. 2008. Introducing Arguments. Cambridge, MA: MIT Press.
F. Reali and M. Christiansen. 2005. Uncovering the richness of the stimulus: Structure dependence and indirect statistical evidence. Cognitive Science, 29, 1007-1028.
T. Q. Sun and Y. F. Li. 2010. Hanyu fei hexin lunyuan yunzhun jiegou chu tan (Licensing non-core arguments in Chinese), Zhongguo Yuwen (Studies of the Chinese Language) 334: 21-33.
B. Yuan. 2010. Domain-wide or variable-dependent vulnerability of the semantic-syntax interface in L2 acquisition? Evidence from wh-words used as existential polarity words in L2 Chinese grammars. Second Language Research 26: 219-60.

Appendix A: Test Sentences in the AMT

    Type of Verbs    Item               Question Number   Target Sentence
    Test sentences   na2 'take'         1                 小明拿了小華一本雜誌
                     ban1 'carry'       6                 小張搬了小李一張桌子
                     reng1 'toss'       9                 張三扔了李四一件外套
                     tou1 'steal'       3                 阿明偷了阿華兩瓶可樂
                     mai4 'buy'         8                 張三買了李四一支毛筆
                     ying2 'win'        12                小張贏了小李一隻手錶
    Fillers          sha1 'kill'        2                 小明殺了小華兩頭小羊 (matches both)
                     gei3 'give'        7                 老李給了老張一隻小鳥 (matches both)
                     dao3 'collapse'    10                老王倒了小李一棵小樹 (ungrammatical)
                     gei3 'give'        4                 老李關了老張一隻小鳥 (ungrammatical)
                     song4 'give'       5                 小華送了小李兩瓶可樂 (matches neither)
                     jiao1 'teach'      11                張三教了瑪莉兩題數學 (matches neither)

Appendix B: Sample Answer Sheet of the AMT

    Question Number   Which Animation do you choose?   If you tick 'I don't know', please tick or state the reason
    1.                A / B / I don't know             Neither of the two animations is correct.
                                                       I do not understand the sentence that I heard.
                                                       Other reason ______

Appendix C: Test sentences in the AJT

    Type of Verbs          Item             Question Number   Target Sentence
    Verbs of Consumption   chi1 'eat'       1                 李四吃了張三兩個蛋糕
                           he1 'drink'      8                 小華喝了小明兩瓶紅酒
                           yong4 'use'      17                小李用了小張一支鉛筆
    Verbs of Creation      kao3 'bake'      6                 *阿華烤了小明一個蛋糕
                           zhu3 'cook'      12                *小李煮了老張一頓晚餐
                           zao4 'build'     14                *張三造了老李一棟房子
    Control Sentences      chi1 'eat'       2                 李四吃了兩個蛋糕
                           he1 'drink'      9                 小華喝了兩瓶紅酒
                           yong4 'use'      13                小李用了一支鉛筆
                           kao3 'bake'      4                 阿華烤了一個蛋糕
                           zhu3 'cook'      11                小李煮了一頓晚餐
                           zao4 'build'     16                張三造了一棟房子
    Fillers                gei3 'give'      3                 老李給了老張一隻小鳥
                           song4 'give'     7                 小華送了小李兩瓶可樂
                           jiao1 'teach'    15                張三教了瑪莉兩題數學
                           gei3 'give'      5                 *老李給了隔壁老張
                           song4 'give'     10                *小華送了鄰居小李
                           jiao1 'teach'    18                *張三教了同學瑪莉

Appendix D: Sample Answer Sheet of the AJT

                                Very Unacceptable   Unacceptable   Acceptable   Very Acceptable
    1. 阿明吃了我兩個蛋糕。      1                   2              3            4                 I don't know

Activating Attributes in Frames

Jens Fleischhauer
Department of General Linguistics
Heinrich-Heine-Universität Düsseldorf
Universitätsstrasse 1
40225 Düsseldorf, Germany
[email protected]

1 Introduction

The general topic addressed in this paper is the activation of scalar attributes in the context of degree gradation of non-scalar verbs. Non-scalar verbs such as German stinken 'stink' do not lexically encode a scale, meaning there is no scalar attribute in their lexical representation. Nevertheless such verbs can be used in a degree context as in (1). In the sentence, the intensifier sehr 'very' specifies the intensity of the dog's smell.

(1) Der Hund stinkt sehr.
    the dog stinks very
    'The dog stinks very much.'

If the verb does not lexicalize a scale, a scalar attribute has to be activated in the degree context; otherwise the degree construction could not be interpreted. Therefore, I will argue (i) that the scalar attribute is retrieved from the conceptual knowledge associated with a meaning component specified in the verb, and (ii) that frames provide a suitable means of representing the process of (scalar) attribute activation. The aim of the paper is to illustrate how this process is constrained.

2 Verb gradation

Following Bierwisch (1989), gradation is a linguistic process of comparing two degrees on a scale. Gradation is usually associated with adjectives, and languages like English and German have special adjectival degree morphology such as comparative -er and superlative -est in English. However, gradation is not restricted to adjectives (Sapir, 1944; Bolinger, 1972); verbs and nouns can also be graded (see e.g. Morzycki (2009) on the gradation of nouns). Verbs and nouns differ from adjectives in not having special degree morphemes (at least in English and German). A further difference between the gradation of adjectives and verbs is that verb gradation is more complex than its adjectival equivalent. It is possible either to specify the temporal extent (duration or frequency) of an eventuality or to specify the degree of a gradable property associated with the verb. The first type is called 'extent gradation', the second is called 'degree gradation' (Bolinger, 1972; Löbner, 2012; Fleischhauer, 2014). Two German examples of verbal degree gradation are shown in (2).

(2) a. Peter ist sehr gewachsen.
       Peter is very grown
       'Peter has grown a lot.'
    b. Peter hat sehr geblutet.
       Peter has very bled
       'Peter bled a lot.'

In (2-a), the intensifier sehr specifies the degree to which Peter increased in size; it is a vague, context-dependent high degree (see Fleischhauer (2013) for a deeper discussion of degree gradation of change of state verbs). In (2-b) the intensifier indicates the quantity of emitted blood.

There is a crucial difference between the verbs wachsen 'grow' and bluten 'bleed' in (2); the former is lexically scalar, whereas the latter is not. A verb is lexically scalar iff it expresses a scalar predication in every context of use (see, among others, Levin and Rappaport Hovav (2010) and Fleischhauer and Gamerschlag (2014) on scalar verbs). In (3-a) wachsen expresses a comparison between the size of the child at the beginning of the event and its size at the end of the event. Hence, it expresses a scalar predication although it is not modified by an intensifier.

(3) a. Peter ist gewachsen.
       Peter is grown
       'Peter has grown.'
    b. Peter hat geblutet.
       Peter has bled
       'Peter bled.'

The sentence in (3-b) does not compare the quantity of blood emitted by the boy to some other quantity; hence, the verb is lexically non-scalar. This means that only wachsen but not bluten lexically encodes a scale.

Although the verb bluten is gradable (2-b), it does not lexicalize a scale. The gradation scale varies for different verbs: it is an intensity scale in (1) and a quantity scale in (2-b). Since the scale varies for different verbs, it is not contributed by the intensifier. Rather, a suitable gradation scale is retrieved from conceptual knowledge.

3 Frames

Frames, in the sense of Barsalou (1992a; 1992b), are recursive attribute-value structures. A frame is a representation of a concept and represents the referent of the concept in terms of its attributes, the values of the attributes, the attributes of the values and so on. One way of representing frames is by using attribute-value matrices (AVMs) as in figure 1. The AVM in figure 1 shows a partial frame for the concept 'tree' (based on Petersen and Osswald (2012)). A tree consists of a crown and a trunk, hence CROWN and TRUNK are attributes [1] in the frame of 'tree'. The value of the attribute CROWN is the underspecified value or, in different terms, the uninstantiated type 'crown'. The value of TRUNK is the uninstantiated type 'trunk', which can be further characterized as having an attribute BARK. The bark of the tree is characterized as having a certain color.

    tree
      CROWN   crown
      TRUNK   trunk
                BARK   bark
                         COLOR   color

Figure 1: Partial frame for the concept 'tree'.

[1] Attributes are written in small capitals.

Following Löbner (1998; 2014) and Petersen (2007), attributes are partial functions; they assign a unique value to their possessor argument. The requirement of functionality provides a formal constraint on possible attributes. As attributes are functions, it is possible to distinguish scalar and non-scalar attributes by looking at their domains. If the values in the domain are linearly ordered, the attribute is a scalar one (e.g. SIZE). If there is no linear order of the domain's values, it is a non-scalar attribute, such as COLOR.

    Object [COLOR: color]        Color
      Bark   (...)                 red   green   blue   (...)

Figure 2: Partial type signature.

To restrict the admissible attributes for a frame and the admissible values for an attribute, types can be assigned to frames. Types are ordered with regard to their specificity in a type signature (Carpenter, 1992), as shown in figure 2. The type signature defines 'bark' as a subtype of the type 'object'; 'red', 'green' and 'blue' are defined as subtypes of 'color'. The type signature is enriched with appropriateness conditions (ACs) which serve two tasks: first, they restrict the set of appropriate attributes for frames to a certain type. Second, ACs specify the appropriate values for an attribute; it is required that all values of an attribute are of a certain type (see Petersen (2007), Petersen et al. (2008), Petersen and Gamerschlag (2014)). COLOR restricts its values to be of the type 'color' or one of its subtypes. Furthermore, the attribute COLOR is an appropriate attribute for 'object'. Since 'bark' is a subtype of 'object', it inherits this AC. Thus, objects of the type 'bark' have a color but do not have, for example, a price, since the type signature does not define PRICE as an appropriate attribute for 'bark'.

4 Frame analysis of degree gradation

In section 2, I suggested that the degree context activates the relevant gradation scale in the case of lexically non-scalar verbs. This process is not arbitrary but restricted by the lexical semantics of the verb. There are two reasons for this assumption: First, each semantic class of gradable verbs is only related to a single gradation scale. Second, different semantic classes of verbs are related to different gradation scales. As discussed above, verbs of substance emission such as bluten 'bleed' are related to a quantity scale (2-b), but verbs of smell emission, like stinken 'stink' in (1), are related to an intensity scale.

In the following, the analysis concentrates on the verb bluten. The verb denotes a process of substance emission. Its single argument is the emitter, the one who is emitting blood. The emittee, which is the emitted substance, is an implicit semantic argument of the verb (Goldberg (2005) speaks of an incorporated theme argument). A frame representation for bluten, capturing the mentioned aspects, is given in figure 3. The boxed numeral in the frame indicates structure sharing (Pollard and Sag, 1994) and indicates that the value of EMITTER is coextensive with some other structure, the externally specified subject.

    substance emission
      EMITTER   [1]
      EMITTEE   blood

Figure 3: Frame for the verbal concept bluten 'bleed'.

Degree gradation affects the quantity of the emitted blood; hence QUANTITY is an attribute of the emittee. The frame representation for sehr bluten 'bleed a lot' is shown in figure 4. The intensifier sehr activates the scalar attribute QUANTITY in the frame of bluten and specifies the value of QUANTITY as 'high'.

    substance emission
      EMITTER   [1]
      EMITTEE   blood
                  QUANTITY   high

Figure 4: Frame for sehr bluten 'bleed a lot'.

As QUANTITY is an attribute of 'blood', it is the object knowledge associated with 'blood' that licenses its activation. A partial frame for 'blood' is given in figure 5.

    blood
      CONSISTENCY   liquid
      COLOR         red

[...] only attribute that can be activated in a degree context to provide a suitable gradation scale.

I propose the constraint in (4) as a restriction for the activation of scalar attributes in the frames of lexically non-scalar verbs:

(4) Only meaning components that are lexically specified in the verb license the activation of scalar attributes.

In the frame for bluten (figure 3) only the emittee is lexically specified as being blood. The emitter is not specified in the verb; rather, it is introduced by the subject argument and therefore does not give access to specific conceptual knowledge.

5 Restricting the scalar attribute

An apparent problem is the claim that the frame for bluten only contains one scalar attribute, namely QUANTITY. It is clearly the case that we can speak not only of the quantity of blood but also of its temperature or pressure. TEMPERATURE as well as PRESSURE are scalar attributes too, so the question emerges why it is only QUANTITY but not TEMPERATURE or PRESSURE that is activated in a degree context.

To tackle this problem one has to realize that the gradable verbs of substance emission are not restricted to those that express an emission of a liquid like blood. Other verbs of this class express the emission of a solid like hair in (5).

(5) Die Katze hat sehr gehaart.
    the cat has very shed
    'The cat lost many hairs.'

The type signature in figure 6 defines 'liquid' to be a supertype of 'blood' and 'water' and to be a subtype of 'substance'. 'Solids' are also a subtype of 'substance' and form the supertype of, for example, 'hair' and 'scall'. The attributes shared
Cloze test 39 35 29   by liquids and solids are inherited from their com- QUANTITY quantity Following the AMT was the Acceptability   mon supertype, for example CONSISTENCY and score (38-40) (33-37) (27-32)   Judgment Task (AJT), which tapped participants’ Figure 5: Partial frame for the concept Blut QUANTITY. But there are attributes which ‘hair’ explicit knowledge on forms. 2 different types of (ranges in ‘blood’. and ‘blood’ do not share and these are inherited verbs that induced opposite directionality of brackets) from the more specific supertypes ‘liquids’ and transfer (i.e., grammatical LA-source and un- It is part of our knowledge of ‘blood’ that it has ‘solids’ respectively. For example, liquids do have grammatical LA-goal) were included, 3 items per Table 1: Participants’ Background Information a certain consistency (‘liquid’), has a certain color a temperature and a pressure but we do not think type. In addition, with 6 control sentences and 6 (‘red’) and is of a certain quantity. While the at- of solids in terms of the attributes PRESSURE and fillers, the AJT contained 18 items in total, half 3 Results and discussion tributes CONSISTENCY and COLOR have fixed val- TEMPERATURE. This does not result in the claim grammatical and half ungrammatical. Please see Table 2 presents the percentage of how often par- ues for blood, the value of QUANTITY is depen- that solids do not have a temperature but I do not Appendix C. Rating scale ranged from very un- ticipants chose a certain animation in the AMT dent on the possessor of the blood. In figure 5 the think that TEMPERATURE is an attribute in our ob- acceptable (1), unacceptable (2), acceptable (3), (for example, the (a) condition in example (6) only scalar attribute is QUANTITY, hence it is the ject knowledge of ‘hair’ or ‘scall’; so we do not to very acceptable (4). A ‘don’t know’ option above depicts a Goal condition).

A User-Based Approach to Spanish-Speaking L2 Acquisition of Chinese Applicative Operation

Nana Y-H Huang
Cambridge University
[email protected]

1 Introduction: Low Applicative Operations

Recent studies of argument structure distinguish non-core (applied) arguments from core arguments in the sense that non-core ones do not belong to the basic argument structure of verbs and that they enter argument structures through Applicative Operations (AO) introduced by functional heads such as Low Applicative-source (LA-source) or Low Applicative-goal (LA-goal) heads (Pylkkänen, 2000; 2002; 2008; Cuervo, 2003). Because languages make use of different applicative heads, in this study, I examine the acquisition of Chinese AO by Spanish-speaking L2 learners and propose a usage-based approach for the results collected from a comprehension task and an acceptability judgment task.

1.1 Applicative Operations in Spanish

Cuervo (2003) reports that in Spanish a predicate which expresses the transfer of a theme to a goal, such as verbs indicating creation (e.g. cocinar ‘cook/bake’, construir ‘build’, etc.), allows LA-goal, where the applied argument is the dative argument, as in (1).

(1) Valeria le diseñó una pollera a Anna.
    Valeria CL designed a skirt DAT Anna
    Lit.: ‘Valeria designed Anna a skirt.’

A Spanish applied argument can also appear in the environment of a transfer predicate with ‘reverse directionality’, such as robar ‘steal’, sacar ‘take from’, and extraer ‘take out from’. In this case the applied argument is understood as the possessive source of the theme object.

(2) Pablo le robó la bicicleta a Anna.
    Pablo CL stole the bike DAT Anna
    Lit.: ‘Pablo stole Anna the bike.’

The source argument appears in dative case, which has the same morphosyntactic properties as a recipient argument; therefore, it is predicted that in the context of verbs with underspecified directionality (e.g., vender ‘sell’ and alquilar ‘rent’) and verbs of motion (e.g., lanzar ‘throw’ and patear ‘kick’), the applied argument would be ambiguous between a goal and a source. Cuervo provides such an example as (3).

(3) Valeria le vendió el auto a su hermano.
    Valeria CL sold the car DAT her brother
    1. Valeria sold the/her car to her brother.
    2. Valeria sold her brother’s car.

1.2 Applicative operations in Chinese

In Chinese, AO is just as productive; nevertheless, unlike Spanish, Chinese only allows LA-source (see (4)) but not LA-goal (see (5)):

(4) Zhangsan tou-le Lili liang tai diannao.
    Zhangsan steal-PERF Lili two CL computer
    ‘Zhangsan stole Lili of two computers.’

(5) *Zhangsan sheji-le Lili liang jian qunzi.
    Zhangsan design-PERF Lili two CL skirt
    ‘Zhangsan designed Lili two skirts.’

1.3 Research Questions

This study examines Spanish L2ers’ acquisition of Chinese AO and considers the learnability problem posed by the superset-subset relation between Spanish and Chinese on this structure (i.e. Spanish allows both LA-goal and LA-source while Chinese allows only LA-source). We predict learners to wrongly transfer LA-goal, which is allowed in L1 Spanish, to L2 Chinese despite the lack of positive evidence for the use of LA-goal in L2 input. Furthermore, due to lack of negative evidence (from the fact that AO do not appear in pedagogical textbooks nor in classrooms designed for L2ers), L2 Chinese input lacks information regarding ungrammaticality of LA-goal, which would be necessary for L2ers to rule out incorrect hypotheses. That is, these learners are expected to show overgeneralization from early on till even at the advanced level.

represent these concepts by using the attribute TEMPERATURE.

                          T
                      Substance
                        CONSISTENCY: consistency
                        QUANTITY: quantity
        Liquids                              Solids
          CONSISTENCY: liquid                  CONSISTENCY: solid
          TEMPERATURE: temperature
          PRESSURE: pressure
    Blood        Water                   Hair        Scall

Figure 6: Partial type signature.

As verbs of substance emission do not only express the emission of liquids but of solids too, the admissible scalar attributes that can be activated in a degree context are restricted to those inherited from the common supertype of liquids and solids, which is ‘substance’. Since QUANTITY but not TEMPERATURE or PRESSURE is inherited from ‘substance’, it is only QUANTITY that can be activated in the context of degree gradation. Besides the constraint in (4) a further constraint restricting the activation of scalar attributes is required:

(6) The activation of scalar attributes is restricted to those attributes which are inherited from the most specific common supertype.

The most specific common supertype for emittable substances like ‘blood’ and ‘hair’ is ‘substance’. Hence, (6) restricts the activation of scalar attributes to those which are inherited from ‘substance’; those attributes inherited from a more specific supertype like ‘liquids’ cannot be activated in a degree context.

6 Conclusion

In this paper, I have shown that lexically non-scalar verbs can be graded by intensifiers like sehr. But this requires the activation of a suitable scalar attribute, otherwise the degree construction could not be interpreted. The process of attribute activation is not unconstrained, rather the lexical meaning of the verb as well as conceptual knowledge provide constraints on this process. The scalar attribute is activated from the conceptual knowledge associated with a meaning component lexically specified in the verb. Furthermore, the gradable attributes that can be activated are restricted to those inherited from the most specific common supertype. This ensures a homogeneous interpretation of degree gradation of verbs of substance emission, otherwise degree gradation of verbs (of substance emission) would be totally idiosyncratic.

Frames provide a suitable framework for the analysis of the sketched phenomenon as they allow representing lexical knowledge and conceptual knowledge in the same representational format. The frame analysis in this paper concentrates on a single semantic verb class but it can easily be extended to cover other classes of gradable verbs, for example verbs of smell/light/sound emission or experiencer verbs, too. I propose that the general constraints formulated in (4) and (6) hold for these classes of verbs as well; the only difference consists in the associated conceptual knowledge.

The process of attribute activation is not restricted to scalar attributes in the context of verbal degree gradation. A similar process occurs if verbs of sound emission are used for denoting motion events like in (7) (based on Kaufmann (1995, 93)). In this construction, a motion frame is activated which is licensed by the fact that the motion of a motorbike produces a yowling sound. In this case and in opposition to verbal degree gradation, knowledge of the subject referent is relevant too.

(7) Das Motorrad jaulte über die Kreuzung.
    the motorbike yowled over the crossing
    ‘The motorbike yowled over the crossing.’

It is a promising task for the future to explore the process of attribute activation in more detail and to see how the activation of attributes from the conceptual knowledge is constrained by lexical semantics and other factors.

Acknowledgments

The paper is a result of my work in the Collaborative Research Center “The Structure of Representations in Language, Cognition, and Science” supported by the German Science Foundation (DFG).
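The interplay of the type signature in figure 6 with the constraints in (4) and (6) can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the formalism of the frame analysis itself: the dictionary encoding, the function names, and the infinitive haaren (assumed here as the citation form of gehaart in (5)) are all introduced for the example, and the fixed values of CONSISTENCY are left out.

```python
# Type signature after figure 6: each type points to its direct supertype.
SUPERTYPE = {
    "substance": None,
    "liquid": "substance",
    "solid": "substance",
    "blood": "liquid",
    "water": "liquid",
    "hair": "solid",
    "scall": "solid",
}

# Attributes introduced at each type; subtypes inherit them.
INTRODUCED = {
    "substance": {"CONSISTENCY", "QUANTITY"},
    "liquid": {"TEMPERATURE", "PRESSURE"},
    "blood": {"COLOR"},
}

# Scalar attributes named in the text; CONSISTENCY and COLOR are not scalar.
SCALAR = {"QUANTITY", "TEMPERATURE", "PRESSURE"}

# Hypothetical verb frames: only the emittee is lexically specified,
# the emitter is supplied by the subject (None = not specified).
VERB_FRAMES = {
    "bluten": {"EMITTER": None, "EMITTEE": "blood"},
    "haaren": {"EMITTER": None, "EMITTEE": "hair"},
}


def supertype_chain(type_name):
    """Return the type and all its supertypes, most specific first."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = SUPERTYPE[type_name]
    return chain


def attributes_of(type_name):
    """Collect attributes introduced at the type or inherited from above."""
    attrs = set()
    for t in supertype_chain(type_name):
        attrs |= INTRODUCED.get(t, set())
    return attrs


def most_specific_common_supertype(type_names):
    """Find the lowest type that dominates every type in type_names."""
    chains = [supertype_chain(t) for t in type_names]
    for candidate in chains[0]:
        if all(candidate in chain for chain in chains):
            return candidate
    raise ValueError("types share no supertype")


def activatable_attributes(verb_class):
    """Apply constraints (4) and (6) to a class of verbs."""
    # (4): only lexically specified components give access to attributes.
    specified = [
        value
        for verb in verb_class
        for value in VERB_FRAMES[verb].values()
        if value is not None
    ]
    # (6): keep scalar attributes inherited from the common supertype.
    common = most_specific_common_supertype(specified)
    return attributes_of(common) & SCALAR
```

Applied to the whole class, activatable_attributes(["bluten", "haaren"]) leaves only QUANTITY, because ‘substance’ is the most specific type shared by ‘blood’ and ‘hair’. Applied to bluten alone, TEMPERATURE and PRESSURE would also survive, which is the overgeneration that (6) is meant to exclude.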

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage . Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

References

[Barsalou 1992a] Lawrence Barsalou. 1992a. Cognitive Psychology. An overview for cognitive scientists. Hillsdale/ NJ, Lawrence Erlbaum Associates.

[Barsalou 1992b] Lawrence Barsalou. 1992b. Frames, concepts, and fields. In A. Lehrer and E. F. Kittay (eds.), Frames, fields, and contrasts, 21–74. Hillsdale/ NJ, Lawrence Erlbaum Associates.

[Bierwisch 1989] Manfred Bierwisch. 1989. The Semantics of Gradation. In M. Bierwisch and E. Lang (eds.), Dimensional Adjectives, 71–261. Berlin, Springer.

[Bolinger 1972] Dwight Bolinger. 1972. Degree Words. Mouton, The Hague.

[Carpenter 1992] Bob Carpenter. 1992. The Logic of Typed Feature Structures. Cambridge, Cambridge University Press.

[Fleischhauer 2013] Jens Fleischhauer. 2013. Interaction of telicity and degree gradation in change of state verbs. In B. Arsenijević, B. Gehrke and R. Marin (eds.), Studies in Composition and Decomposition of Event Predicates, 125–152. Dordrecht, Springer.

[Fleischhauer 2014] Jens Fleischhauer. 2014. Degree Gradation of Verbs. Doctoral dissertation, Heinrich-Heine-Universität Düsseldorf.

[Fleischhauer and Gamerschlag 2014] Jens Fleischhauer and Thomas Gamerschlag. 2014. We’re going through changes: How change of state verbs and arguments combine in scale composition. Lingua 141:30–47.

[Goldberg 2005] Adele Goldberg. 2005. Argument realization: The Role of constructions, lexical semantics and discourse factors. In J.-O. Östman and M. Fried (eds.), Construction Grammars: Cognitive Grounding and Theoretical Extensions, 17–43. Amsterdam, John Benjamins.

[Kaufmann 1995] Ingrid Kaufmann. 1995. What is an (im)possible verb? Restrictions on Semantic Form and their consequences for argument structure. Folia Linguistica XXIX/1-2:67–103.

[Löbner 1998] Sebastian Löbner. 1998. Definite Associative Anaphora. In S. Botley (ed.), Proceedings of DAARC96 - Discourse Anaphora and Resolution Colloquium. Lancaster University, July 17th-18th. Lancaster.

[Löbner 2012] Sebastian Löbner. 2012. Sub-compositionality. In M. Werning, W. Hinzen and E. Machery (eds.), The Oxford Handbook of Compositionality, 220–241. Oxford, Oxford University Press.

[Löbner 2014] Sebastian Löbner. 2014. Evidence for frames from natural language. In T. Gamerschlag, D. Gerland, R. Osswald, and W. Petersen (eds.), Frames and Concept Types: Applications in Language and Philosophy, 23–68. Dordrecht/ Heidelberg/ New York, Springer.

[Morzycki 2009] Marcin Morzycki. 2009. Degree modification of gradable nouns: size adjectives and adnominal degree morphemes. Natural Language Semantics 17:175–203.

[Petersen 2007] Wiebke Petersen. 2007. Representation of Concepts as Frames. In J. Skilters, F. Toccafondi, and G. Stemberger (eds.), Complex Cognition and Qualitative Science. The Baltic International Yearbook of Cognition, Logic and Communication. Vol 2, 151–170. Riga, University of Latvia.

[Petersen et al. 2008] Wiebke Petersen, Jens Fleischhauer, Peter Bücker, and Hakan Beseoglu. 2008. A Frame-based Analysis of Synaesthetic Metaphors. The Baltic International Yearbook of Cognition, Logic and Communication. Vol 3, 1–22.

[Petersen and Gamerschlag 2014] Wiebke Petersen and Thomas Gamerschlag. 2014. Why chocolate eggs can taste old but not oval: A frame-theoretic analysis of inferential evidentials. In T. Gamerschlag, D. Gerland, R. Osswald, and W. Petersen (eds.), Frames and Concept Types: Applications in Language and Philosophy, 199–220. Dordrecht/ Heidelberg/ New York, Springer.

[Petersen and Osswald 2012] Wiebke Petersen and Tanja Osswald. 2012. A Formal Interpretation of Concept Types and Type Shifts. In K. Kosecki and J. Badio (eds.), Cognitive Processes in Language, 183–191. Frankfurt, Peter Lang.

[Pollard and Sag 1994] Carl Pollard and Ivan A. Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago, The University of Chicago Press.

[Rappaport Hovav and Levin 2010] Malka Rappaport Hovav and Beth Levin. 2010. Reflections on manner/result complementarity. In M. Rappaport Hovav, E. Doron and I. Sichel (eds.), Syntax, Lexical Semantics, and Event Structure, 21–38. Oxford, Oxford University Press.

[Sapir 1944] Edward Sapir. 1944. Grading: A Study in Semantics. Philosophy of Science 11(2):93–116.

Dressler, W. (1985). On the predictiveness of Natural Morphology. Journal of Linguistics 21, 321
Duñabeitia J. A., Perea M. & Carreiras M. (2007). Do transposed-letter similarity effects occur at a morpheme level? Evidence for morpho-orthographic decomposition. Cognition 105, 691–703.
Forster, K.I. & Davis, C. (1984). Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 680-698.
Forster, K.I. (1999). The microgenesis of priming effects in lexical access. Brain and Language, 68, 5-15.
Frost, R., Kugler, T., Deutsch, A. & Forster, K.I. (2005). Orthographic structure versus morphological structure: principles of lexical organization in a given language. Journal of Experimental Psychology: Learning, Memory and Cognition, 31, 1293-1326.
Giraudo, H. & Grainger, J. (2003). On the role of derivational affixes in recognizing complex words: Evidence from masked affix priming. In R. H. Baayen and R. Schreuder (Eds.), Morphological Structure in Language Processing. Mouton de Gruyter: Berlin, 209-232.
Giraudo, H. & Montermini, F. (2010). Primary stress assignment in Italian: linguistic and experimental issues. Lingue e Linguaggio, 2, 113-129.
Grossmann, M. & Rainer, F. (eds.), La formazione delle parole in italiano. Tübingen: Niemeyer.
Nespor, M. & Vogel, I. (2007). Prosodic Phonology. Mouton de Gruyter: Berlin.
Orsolini M. & Marslen-Wilson W.D. (1997). Universals in morphological representation: Evidence from Italian. Language and Cognitive Processes 12: 1-47.
Rastle, K. & Davis, M. H. (2008). Morphological decomposition based on the analyses of orthography. Language and Cognitive Processes 23 (7-8), 942-971.
Rastle, K., Davis, M.H. & New B. (2004). The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition. Psychonomic Bulletin and Review 11 (6), 1090-1098.
Rastle, K., Davis, M.H., Marslen-Wilson, W.D. & Tyler, L.K. (2000). Morphological and semantic effects in visual word recognition: A time-course study. Language and Cognitive Processes 15 (4-5), 507-537.
Reid, A.A. & Marslen-Wilson, W.D. (2003). Lexical representation of morphologically complex words: Evidence from Polish. In Baayen, R.H. & Schreuder, R. (eds.), Morphological Structure in Language Processing, 287-336.
Stanners, R.F., Neiser, J.J., Hernon, W.P. & Hall, R. (1979). Memory representation for morphologically related words. Journal of Verbal Learning and Verbal Behavior, 18, 399-412.
Taft, M. (2003). Morphological representation as a correlation between form and meaning. In E. Assink & D. Sandra (Eds.), Reading complex words. Amsterdam: Kluwer, 113-137.
Thornton, A.M., Iacobini, C. & Burani, C. (1997). BDVDB Una base di dati sul Vocabolario di Base della lingua italiana, seconda edizione riveduta e ampliata. Roma: Bulzoni.
Vroomen, J. & de Gelder, B. (1999). Lexical access of resyllabified words: evidence from phoneme monitoring. Memory and Cognition 27 (3), 413-21.

the boundary of the syllable, whereas with -ico and -etto the suffix is split in the two last syllables. In the Natural Morphology framework, the more the morphology overlaps with the phonological components (i.e. the higher the morpho-tactic transparency) the easier the recognition;
iii. word stress: the suffixes -tore and -etto always carry the word stress, while -ico does not. Moreover, in Italian, the stressed syllable has a long vowel [–’to:re] which, although not phonological, may constitute a perceptual hint for an easier identification. Finally, words with -tore and -etto show the most frequent stress pattern in Italian (about 80% of the words have the word stress on the penultimate syllable, Thornton, Iacobini & Burani 1997; see Burani & Arduino 2004 and Giraudo & Montermini 2010 on the effect of stress regularity and stress consistency in stress assignment for Italian words).

According to these criteria -tore is the most salient suffix and -etto is more salient than -ico.

In the first experiment we will verify: a) whether words with a perceptually salient suffix like -tore are recognized faster than words with a less salient suffix like -ico. If this is the case, the word lavoratore should prime viaggiatore better than ironico primes metallico; b) whether a word belonging to a more consistent word ending series (like -tore) is recognized faster than a word belonging to a less consistent word ending series (like -etto). According to this hypothesis, we expect a larger priming effect for words with -tore than for words with -etto.

The affix condition (our test condition), i.e. the effect of the presentation of a suffixed word as a prime on the recognition of a complex target word with the same suffix (servitore / EDUCATORE, sinfonico / NOSTALGICO, boschetto / PEZZETTO), will be considered in relation to 3 other conditions: the identity condition (educatore / EDUCATORE, nostalgico / NOSTALGICO, pezzetto / PEZZETTO), which should yield the main facilitation effect and consequently the shortest RTs, and the unrelated condition (colomba / EDUCATORE, approccio / NOSTALGICO, ombelico / PEZZETTO), which, on the contrary, is expected to yield the smallest facilitation effect and the longest RTs. These two conditions are considered as baselines to assess RTs obtained in the test condition. Moreover, in the stem condition we will contrast the strength of the connection between words with the same suffix and morphological schema (test condition) with the strength of the connection between words sharing the same stem (educare / EDUCATORE, nostalgia / NOSTALGICO, pezzo / PEZZETTO).

In the second experiment we will focus on the issue of the sequential organization of the word, namely that the access and processing of a suffixed word is affected by the position of the suffix at the end of the word and by the (visual) perception of the final part of the word. In order to verify this aspect, we will use the same critical materials as in the first experiment but we will manipulate the location of the fixation point. Specifically, in the forward mask which precedes the presentation of the prime/target pairs, the fixation marks (####), whose aim is to focus attention on a certain point of the screen, will overlap with the suffix position.

To sum up, our research will help to verify the role of suffixes and morphological schemas in the access and processing of Italian complex words and to investigate whether (and possibly to what extent) suffix salience affects this process. Results will indicate if native speakers of Italian organize lexical items according to morphological series as they do according to morphological families.

References

Alessandro Laudanna, Cristina Burani, Antonella Cermele (1994). Prefixes as processing units. Language and Cognitive Processes, 9, 295-316
Bertinetto, P. M., Burani, C., Laudanna A., Marconi L., Ratti D., Rolando C., Thornton A. M. (2005). Corpus e Lessico di Frequenza dell'Italiano Scritto (CoLFIS). http://linguistica.sns.it/CoLFIS/CoLFIS_home.htm.
Booij, G. (2010). Construction Morphology. Oxford: Oxford University Press.
Burani, C., & Arduino, L.S. (2004). Stress regularity or consistency? Reading aloud Italian polysyllables with different stress patterns. Brain and Language, 90, 318-325.
Bybee, J. (1988). Morphology as lexical organization. In M. Hammond & M. Noonan (eds.), Theoretical Morphology. Approaches to modern linguistics (pp. 119-142). San Diego: Academic Press.
Bybee, J. (1995). Regular morphology and the lexicon. Language and Cognitive Processes, 10, 5, 425-55.
Clahsen, H., Sonnenstuhl, I. & Blevins, J.P. (2003). Derivational morphology in the German mental lexicon: A Dual Mechanism account. In H. Baayen & R. Schreuder (Eds.), Morphological structure in language processing (pp. 125-155). Mouton de Gruyter: Berlin.
Derivatario, http://derivatario.sns.it/

Modelling semantic transparency in English compound nouns

Melanie J. Bell                      Martin Schäfer
Anglia Ruskin University             Friedrich Schiller University
Cambridge                            Jena
U.K.                                 Germany
[email protected]                    [email protected]

1 Introduction

Semantic transparency is known to play an important role in the storage and processing of complex words (e.g. Marslen-Wilson et al. 1994), and human raters of transparency achieve high levels of agreement (e.g. Frisson et al. 2008, Munro et al. 2010). In the case of noun-noun compounds, overall transparency is largely determined by the transparency of the individual constituents. For example, Reddy et al. (2011) showed that the perceived transparency of a compound is highly correlated with both the sum and the product of the perceived transparencies of its constituents. Furthermore, many psycholinguistic studies find significant effects for semantic transparency using a four-way distinction based on perceived constituent transparency: transparent-transparent (e.g. carwash), transparent-opaque (e.g. jailbird), opaque-transparent (e.g. strawberry) and opaque-opaque (e.g. hogwash) (Libben et al. 2003). Bell and Schäfer (2013) modelled the transparency of individual compound constituents and showed that shifted word senses reduce perceived transparency, while certain semantic relations between constituents increase it. However, this finding is problematic in at least two ways. Firstly, it is not clear whether there is a solid basis for establishing whether a specific word sense is shifted or not. For example, card in credit card is clearly shifted if viewed etymologically, but may not synchronically be perceived as shifted due to its frequent use. Secondly, work on conceptual combination by Gagné and collaborators has shown that relational information in compounds is accessed via the concepts associated with individual modifiers and heads, rather than independently of them (e.g. Spalding et al. 2010 for an overview). This leads to the hypothesis that it is not whether a specific word sense is etymologically shifted, nor whether a specific semantic relation is used per se, that makes a compound constituent more or less transparent; rather, it is the degree of expectedness of a particular word sense and a particular relation for a given constituent. In this paper, we provide evidence in support of this hypothesis: the more expected the word sense and relation for a constituent, the more transparent it is perceived to be.

2 Method

We used the publicly available dataset described in Reddy et al. (2011), which gives human transparency ratings for a set of 90 compound types and their constituents (N1 and N2), and comprises a total of 7717 ratings. To model the expectedness of word senses and semantic relations for a given compound constituent, we used the constituent families of the compounds, which we extracted in a two-step process. We took all strings of exactly two nouns that follow an article in the British National Corpus and which also occur four times or more in the USENET corpus (Shaoul and Westbury 2010). From this set, we extracted the positional constituent families for all constituent nouns in the Reddy et al. dataset, giving a total of 4553 compounds for the N1 families and 9226 for the N2 families. Each of these compound types was coded for the semantic relation between the constituents (after Levi 1978), and for the WordNet sense of the constituent under consideration (Princeton 2010). We then calculated the proportion of compound types in each constituent family with each semantic relation (relation proportion), and each WordNet sense of the constituent in question (synset proportion). We take these two measures to reflect the expectedness of the respective relations and WordNet senses of the constituents: if a relation or sense occurs in a high proportion of the constituent family, it is more expected. These variables were used, along with other quantitative measures, as predictors in ordinary least squares regression models of perceived constituent transparency. The final model for the transparency of N1 is given in Table 1:

al., 2004) of all forms that can be potentially split into two “surface morphemes” (see for details Rastle & Davis, 2008), granting both stems and affixes equal status as access units during word recognition.

However, when Giraudo & Grainger (2003) addressed this issue using French materials and an experimental design controlling the effect of morphological primes relative to formal primes, results did not show any reliable morphological priming effect, i.e. both priming conditions produced significant priming effects relative to the unrelated baseline but the morphological condition did not yield significantly faster RTs with respect to the orthographic condition. Note that, according to within priming comparisons, the effect of morphological primes is compared to the effect of the orthographic primes on the same targets, e.g., fumet ‘scent’ - MURET ‘low wall’ vs. béret ‘beret’ - MURET ‘low wall’, considering that fumet and muret share the same functional suffix -et, while béret and muret do not because béret is a monomorphemic word in French and ber- is not a possible stem. Giraudo and Grainger, who conversely found in the same study clear morphological priming effects when manipulating prefixed words, interpreted these asymmetrical results on the basis of different semantic and syntactic functions carried by prefixes and suffixes in French. An alternative explanation for Giraudo & Grainger's results could be linked to the issue of perceptual salience of suffixes (i.e. their size and segmental-prosodic features) and to the connected degree of

than French, Italian has relatively long suffixes (e.g. lat. -ĭttu(m) > it. -etto vs. fr. -et, realized phonetically as [e] as in it. muretto/fr. muret).

Moreover, as a result of the fact that Italian has undergone little phonological reduction, it has a high degree of orthographic transparency and consistency, which can contribute to the perception and representation of functional word endings (Taft 2003).

Finally, although in Italian the great majority of suffixed words are paroxytone, i.e. stressed on the penultimate syllable, as suffixes generally carry the word stress, there is a limited number of proparoxytone words (i.e. stressed on the third to last syllable, with a suffix which does not carry the word stress). Consequently, suffixed words in Italian can have different prosodic contours and suffixes can show different degrees of perceptual prominence at the prosodic level. For these reasons, we considered Italian as an ideal test situation to verify the role of salience on suffixed word processing and access.

More precisely, for our experiments we selected some productive suffixes -tore, -ico and -etto because they show different segmental and prosodic features.

Moreover, they have different degrees of functional consistency, i.e. a different proportion between suffixed and non-suffixed words (i.e. monomorphemic words) in a series of words ending with a given letter string (Laudanna et al. 1994). As a matter of fact, while 78% of the words ending with -tore and 52.04% of words with -ico are suffixed, only

                                                   Coef      S.E.     t       Pr(>|t|)
Intercept                                          -4.6413   0.6593   -7.04   <0.0001
relation proportion in N1 family                   -0.2187   0.6013   -0.36    0.7161
log family size of N1                              -0.0189   0.0931   -0.20    0.8395
synset proportion in N1 family                     -0.2426   0.6152   -0.39    0.6934
log synset count of N1                             -0.7939   0.2469   -3.22    0.0013
compound proportion in N1 family (token-based)      3.0130   0.6788    4.44   <0.0001
log frequency of N1                                 0.8728   0.0569   15.34   <0.0001
relation proportion * log family size               0.3311   0.1305    2.54    0.0113
synset proportion * log synset count                0.6855   0.3161    2.17    0.0303
compound proportion * log frequency N1             -0.2804   0.0816   -3.44    0.0006

Table 1: Final model for the transparency of N1, R2 adjusted = 0.334

[Figure 1: three contour plots of perceived N1 transparency. Panels: relation proportion in N1 family by log family size of N1; synset proportion in N1 family by log synset count of N1; compound proportion in N1 family (token-based) by log frequency of N1.]

Figure 1. Interaction plots for N1 transparency

3 Results

All predictors in our model enter into significant interactions, and these are shown graphically in Figure 1, where the contour lines on the plots represent perceived transparency of the first constituent (N1). The first plot shows an interaction between relation proportion and overall (log)

4 Conclusion

Overall, the model provides clear evidence for our hypothesis. N1 is rated as most transparent when it is a frequent word, with a large family, occurring with its preferred semantic relation and most frequent sense, and with few other senses to compete.
We interpret the results as indicating suffix likelihood (the probability for a word to be 20% of the words ending with –etto is suffixed family size: for small families, relation propor- that compound constituents are perceived as a suffixed word). As a matter of fact, it seems (quantitative data are taken from COLFIS and tion plays little role, whereas for larger families, more transparent when they are more expected that the more a word ending is salient and func- Derivatario). The criteria according to which in accordance with our hypothesis, the transpar- (both generally and with a specific sense) and tionally consistent, the stronger the probability it we defined the perceptual salience of the suf- ency of N1 increases with the proportion of the when they occur in their most expected semantic is a suffix. fixes are: corresponding relation in the family. The second environments. In information theory, the less i. size of the suffix (number of phonemes plot shows the interaction between the synset expected an event, the greater its information and graphemes); proportion and the total number of a constitu- content: in so far as perceived transparency is a 2 The present study ii. different degrees of morpho-tactic ent’s senses (as listed in WordNet): only if there reflection of expectedness, it can therefore also transparency (Dressler 1985) and of phono- On such premises, in the present research we is a sufficient number of different senses in the be seen as the inverse of informativity. logical integration of the suffix to the base, in verify by means of a masked priming experiment family is their proportion a reliable predictor of particular in relation to the phenomenon of: and a within-comparison design whether the pro- semantic transparency. 
There is also a small but Acknowledgements - resyllabification: no resyllabification cessing of morphologically complex words is significant interaction between the log frequency takes place with -tore which has always two This work was made possible by three short visit affected by the morphological schema and, more of a constituent and the proportion of the constit- syllables, independently from the root, where- grants from the European Science Foundation specifically, whether the processing is affected uent family (in terms of tokens) represented by as –ico, and –etto, starting with a vowel, are through NETWORDS - The European Network by the formal salience of the suffix. the compound in question: this shows that trans- more integrated with the stem ([i] and [e] be- on Word Structure (grants 4677, 6520 and 7027), We choose to run the experiments on Italian parency increases with frequency, but only in the come the coda of the last syllable of the stem for which the authors are extremely grateful. not only because Italian has a rich, productive lower frequently ranges does the proportion in (sto.ria/ sto.ri.co) and the suffixed word is re- and relatively regular morphology, but also be- the family play a role. syllabified); cause, being a phonetically ‘conservative’ lan- - morphological boundary: with -tore the guage, at least significantly more conservative boundary of the suffix always coincides with

64 121 Suffix perceptual salience in morphological References processing: evidence from Italian Bell, Melanie J. and Martin Schäfer. 2013. Semantic transparency: challenges for distributional semantics. In Aurelie Herbelot, Roberto Zamparelli and Gemma Boleda eds., Proceedings Hélène Giraudo Serena Dal Maso of the IWCS 2013 workshop: Towards a formal Laboratoire CLLE (Equipe ERSS) Dip. Lingue e Letterature Straniere distributional semantics, 1–10. Potsdam: CNRS & Université Toulouse Jean Jaurès [email protected] Association for Computational Linguistics. giraudo@univ -tlse2.fr Frisson, Steven, Elizabeth Niswander-Klement and Alexander Pollatsek. 2008. The role of semantic transparency in the processing of English com- pound words. British Journal of Psychology 991, es on the effect of the (visual) presentation of a 87–107. Abstract stimulus word (the ‘prime’) on the recognition of Levi, Judith N. 1978. The syntax and semantics of a target word. Experimental results indicate that complex nominals. New York: Academic Press. The goal of the present research is to de- the recognition of the target word is faster when termine the role of suffixes and morpho- Marslen-Wilson, William, Lorraine K. Tyler, it is preceded by a morphologically related prime logical schemas in the access and pro- Rachelle Waksler and Lianne Older. 1994. Mor- (e.g. kindness/ KIND), compared to cases where cessing of Italian complex words and to phology and meaning in the English mental lexi- it is preceded by an unrelated word (e.g. raw/ investigate whether (and possibly to what con. Psychological Review 101, 1: 3-33. KIND) or by an only orthographically similar extent) suffix salience affects such pro- Munro, Robert, Steven Bethard, Victor Kuperman, word (e.g. kin/ KIND; kite/ KIND). According to cesses. 
Two experiments using the Vicky Tzuyin Lai , Robin Melnick, Christopher Forster, these results show that “the cortical rep- masked-priming methodology will con- Potts, Tyler Schnoebelen and Harry Tily. 2010. resentations of the prime and the target are inter- tribute to verify if native speakers of Ital- Crowdsourcing and language studies: the new connected or overlap in some way such that the ian organize lexical items according to generation of linguistic data. In Proceedings of the representation of the prime automatically acti- NAACL HLT 2010 Workshop on Creating Speech morphological series as they do accord- vates the representation of the target word” (For- and Language Data with Amazon's Mechanical ing to morphological families. ster, 1999). Turk, pp. 122-130. Association for Computational Linguistics. 1 Introduction On the other hand, the relationship between words with the same suffix and the same mor- Princeton University. 2010. WordNet. In usage-based approaches to language represen- phological schema (in constructional terms), like tation and process (mainly Bybee’s Network kindness/ happiness/ sadness, has been scarcely Reddy, Siva, Diana McCarthy and Suresh Manandhar. Model and Booij’s Constructional Morphology), investigated yet and results do not allow a con- 2011. An empirical study on compositionality in morphology is generally conceived as organizing sistent and univocal interpretation. Marslen- compound nouns. In Proceedings of The 5th In- the lexicon according to two main dimensions: i) Wilson et al. 1996 investigated the role of suffix- ternational Joint Conference on Natural Lan- morphological families, i.e. words connected es in English with a cross-modal technique and guage Processing 2011 IJCNLP 2011, Chiang because sharing the same root: kind/ kindness/ found a significant priming effect for morpholog- Mai, Thailand ically related words (e.g. darkness/ TOUGH- kindly/ unkind/ kind-hearted, etc. 
and ii) morpho- Shaoul, Cyrus and Chris Westbury. 2010. An logical series, i.e. words connected because shar- NESS) and no hints of orthographic priming anonymized multi-billion word USENET corpus ing the same affix kindness/ happiness/ sadness/ when the overlap did not involve real suffixes 2005-2010 abruptness, etc. Psycholinguistic research has (e.g. darkness / HARNESS). More recently, Du- http://www.psych.ualberta.ca/˜westburylab/downl mostly confirmed this view, demonstrating with ñabeitia, Perea & Carreiras 2008 found signifi- oads/usenet.download.html experimental data that words in the mental lexi- cant facilitation effects on the recognition of suf- Spalding, Thomas L., Christina L. Gagné, Allison C. con are stored according to formal and semantic fixed words in Spanish employing a series of Mullaly and Hongbo Ji. 2010. Relation-based in- similarity, thus following morphological princi- experiments with different degrees of prime terpretation of noun-noun phrases: A new theoret- ples. segmentation: 1) er/ WALKER; 2) %%%%er/ ical approach. Linguistische Berichte Sonderheft More specifically, the relationship between WALKER; 3) baker/ WALKER. The experi- 17, 283-315 ments revealed priming effects in all the condi- morphologically complex words and their roots Wurm, Lee H. 1997. Auditory processing of prefixed (or other members of the same morphological tions (independently from the degree of segmen- English words is both continuous and family) has been extensively investigated by tation of the prime) and a clear dissociation be- decompositional. Journal of Memory and Lan- means of the masked-priming experimental par- tween orthographic and morphological priming guage, 37, 438–461. adigm (i.e. Stanners, Neiser, Hernon & Hall, (e.g. brevidad primes igualdad but volumen does 1979; Rastle, Davis, Marslen-Wilson & Tyler, not prime certamen). 
Taken together these re- 2000; Clahsen, Sonnenstuhl & Blevins, 2003; sults were interpreted as a strong evidence in fa- Rastle, Davis & New, 2004; Frost, Kugler, vor of an early prelexical morphological decom- Deutsch & Forster, 2005). This technique focus- position (e.g., Duñabeitia et al., 2007; Rastle et
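As a worked reading of the first interaction reported in Table 1 (final model for the transparency of N1), the sketch below computes how the slope for relation proportion changes with log family size. It assumes the model is an ordinary linear regression, as the t-values and adjusted R² suggest; only the two coefficients are taken from Table 1, while the function name and the log family sizes are hypothetical values chosen for illustration.

```python
# Coefficients copied from Table 1 (final model for the transparency of N1).
COEF_RELATION_PROPORTION = -0.2187  # main effect of relation proportion
COEF_INTERACTION = 0.3311           # relation proportion * log family size


def relation_slope(log_family_size: float) -> float:
    """Change in predicted N1 transparency per unit of relation proportion,
    holding the other predictors fixed (assumes a linear model)."""
    return COEF_RELATION_PROPORTION + COEF_INTERACTION * log_family_size


# Hypothetical log family sizes, not values reported in the paper.
for log_size in (0.5, 1.0, 3.0, 5.0):
    print(f"log family size {log_size:.1f}: slope {relation_slope(log_size):+.4f}")
```

Under these assumptions the slope is close to zero for small families and grows with family size, which matches the description of the first interaction plot.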

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

120 65 [4] Katherine Nelson. The syntagmatic- A bottom up approach to category mapping and meaning change paradigmatic shift revisited: a review of research and theory. Psychological Bulletin, Haim Dubossarsky Yulia Tsvetkov Chris Dyer Eitan Grossman 84(1):93, 1977. The Edmond and Lily Language Tech- Language Tech- Linguistics Department and [5] Charles Egerton Osgood. Cross-cultural uni- Safra Center for Brain nologies Institute nologies Institute the Language, Logic and versals of affective meaning. University of Illi- Sciences Carnegie Mellon Carnegie Mellon Cognition Center nois Press, 1975. The Hebrew Universi- University University The Hebrew University of ty of Jerusalem Pittsburgh, PA Pittsburgh, PA Jerusalem Jerusalem 91904, Is- 15213 USA 15213 USA Jerusalem 91904, Israel rael ytsvetko cdyer eit- haim.dub@gmail @cs.cmu.edu @cs.cmu.edu [email protected] .com uji.ac.il

few single words to few dozen words. Only re- Abstract cently though, have usage-based approaches (Bybee, 2010) become prominent, in part due to In this article, we use an automated bot- their compatibility with quantitative research on tom-up approach to identify semantic large-scale corpora (Geeraerts et al., 2011; categories in an entire corpus. We con- Hilpert, 2006; Sagi et al., 2011). Such approach- duct an experiment using a word vector es argue that meaning change, like other linguis- model to represent the meaning of words. tic changes, are to a large extent governed by and The word vectors are then clustered, giv- reflected in the statistical properties of lexical ing a bottom-up representation of seman- items and grammatical constructions in corpora. tic categories. Our main finding is that In this paper, we follow such usage-based ap- the likelihood of changes in a word’s proaches in adopting Firth’s famous maxim meaning correlates with its position with- “You shall know a word by the company it in its cluster. keeps,” an axiom that is built into nearly all dia- 1 Introduction chronic corpus linguistics (see Hilpert and Gries, 2014 for a state-of-the-art survey). However, it is Modern theories of semantic categories, especial- unclear how such ‘semantic fields’ are to be ly those influenced by Cognitive Linguistics identified. Usually, linguists’ intuitions are the (Geeraerts and Cuyckens, 2007), generally con- primary evidence. In contrast to an intuition- sider semantic categories to have an internal based approach, we set out from the idea that structure that is organized around prototypical categories can be extracted from a corpus, using exemplars (Geeraerts, 1997; Rosch, 1973). a ‘bottom up’ methodology. 
We demonstrate this Historical linguistics uses this conception of by automatically categorizing the entire lexicon semantic categories extensively, both to describe of a corpus, using clustering on the output of a changes in word meanings over the years and to word embedding model. explain th em. Such approaches tend to describe We analyze the resulting categories in light of changes in the meaning of lexical items as the predictions proposed in historical linguistics changes in the internal structure of semantic cat- regarding changes in word meanings, thus egories. For example, (Geeraerts, 1999) hypothe- providing a full-scale quantitative analysis of sizes that changes in the meaning of a lexical changes in the meaning of words over an entire item are likely to be changes with respect to the corpus. This approach is distinguished from pre- prototypical ‘center’ of the category. Further- vious research by two main characteristics: first, more, he proposes that more salient (i.e., more it provides an exhaustive analysis of an entire prototypical) meanings will probably be more corpus; second, it is fully bottom-up, i.e., the cat- resistant to change over time than less salient egories obtained emerge from the data, and are (i.e., less prototypical) meanings. not in any way based on linguists’ intuitions. As Despite the wealth of data and theories about such, it provides an independent way of evaluat- changes in the meaning of words, the conclu- ing linguists’ intuitions, and has the potential to sions of most historical linguistic studies have turn up new, unintuitive or even counterintuitive been based on isolated case studies, ranging from

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

66 119 sociative entropy and the logarithm of corpus fre- facts about language usage, and hence, by hy- Where d is the vector’s dimension length, and W quency: In the case of nouns the correlation is pos- i 1000 pothesis, about knowledge of language. and W ’ represent two specific values at the same r = 0.134 0.367 i itive ( and in the two ages) while vector point for the first and second words, re- the correlation is negative (r = 0.281, 0.222) − − 2 Literature review spectively. 100 for verbs. This peculiar relation would be further Since words with similar meaning have simi- frequency Some recent work has examined meaning change studied with considering morphological entropy in lar vectors, related words are closer to each other in large corpora using a similar bottom-up ap- 10 light of the argument frames of the verbs on the in the semantic space. This makes them ideal for proach and word embedding method (Kim et al., one hand, and the role of syntagmatic associations clustering, as word clusters represent semantic 2014). These works analyzed trajectories of in the associative fields of verbs on the other [4]. ‘areas,’ and the position of a word relative to a 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 meaning change for an entire lexicon, which en- cluster centroid represents its saliency with re- associative overlap 7 abled them to detect if and when each word æ spect to the semantic concept captured by the 6 æ æ ADJ changed, and to measure the degree of such æ æ cluster. This saliency is higher for words that are æ æ æ æ 8 ADJMODN Figure 3: Histogram of pairwise associative over- æ æ 5 æ æ changes. Although these works are highly useful æ ææ æ ææ æ æ æ æ closer to their cluster centroid. 
In other words, a ææ æ æ æ ADJN æ æ æ ææ æ æ laps (age 10-14) æ æ ææ æ æ æ æ æ for our purposes, they do not attempt to explain 4 æ æ æ æ æ æ æ æ æ æ æ æ æ æ æ 8 AdvN æ ææææ æ word’s closeness to its cluster centroid is a æ ææ æ ææ æ æ æ æ æ æ æ æ æ æ æ why words differ in their trajectories of change 3 æ æææ ææ æ INF æ æ æ æ æ æ æ æ æ æ æ ææ æ æ æ æ æ æ measure of its prototypicality. To test for the op- æ æ æ æ æ æææææ æ æ æ æ æ æ æ æ 8 MOD 10 000 æ æ ææ æ æ by relating observed changes to linguistic param- 2 æ æ ææææ ææ æ æ æ æ æ æ morphological entropy æ æ æ æ æ æ æ timal size of the ‘semantic areas,’ different num- ææ æ æ æ N æ ææ ææ æ æ æ æ æ æ æ eters. 1 æ ææ æææ ææ æ æ æ æ æ æ 8  æ N V bers of clusters were tested. For each the cluster- æ æ æ æ 1000 0 æ æ æ æ æ V Wijaya and Yeniterzi (2011) used clustering to 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 ing procedure was done independently. characterize the nature of meaning change. They associative entropy To quantify diachronic word change, we train were able to measure changes in meaning over 100 a word vector model on a historical corpus in an time, and to identify which aspect of meaning frequency Figure 5: Relation between associative and mor- orderly incremental manner. The corpus was had changed and how (e.g., the classical seman- sorted by year, and set to create word vectors for 10 phological entropy (age 18-24). The blue regres- tic changes known as ‘broadening,’ ‘narrowing,’ each year such that the words’ representations at sion line corresponds to nouns, while the red line and ‘bleaching’). Although innovative, only 20 the end of training of one year are used to initial- corresponds to verbs. clusters were used. Moreover, clustering was 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 ize the model of the following year. This allows only used to describe patterns of change, rather associative overlap a yearly resolution of the word vector representa- than as a possible explanatory factor. 
tions, which are in turn the basis for later anal- ææ 6 æ ADJ Figure 4: Histogram of pairwise associative over- 10 æ æ æ yses. To detect and quantify meaning change for æ ADJMODN 3 Method æ laps (age 18-24) æ æ æ æ ææææ æ each word-of-interest, the distance between a æ æ ææ æ æ ADJN æ æ æ æ ææ æ 5 æ æ ææ ææ æ æ æ 10 æ æ ææ æ æææ æ æ æ ææ æ ææ æ ææ æ æ ææ AdvN ææ ææ æ æ æ æ æ A distributed word vector model was used to word’s vector in two consecutive decades was æææ æ ææ æ æ æ æ æ ææ æ ææ ææ æ æ æ æ æ æ æ æ æ æ æ ææ æ æ æ ææ ææ æ ææ æ æ æ æ INF frequency æ ææ æ æ æ æ æ æ æææ æ learn the context in which the words-of-interest computed, serving as the degree of meaning 4 æ æ ææææ ææ æ æ æ ææ ææ 10 æ æ æ æ ææ æ MOD æ æ æ æ æ æ æ æ 2.2 Relationship between associative entropy, æ æ æ are embedded. Each of these words is represent- change a word underwent in that time period æ æ ææ æ æ æ æ æ N æ æ æ æ ææ morphological entropy and frequency 1000 æ NV ed by a vector of fixed length. The model chang- (with 2 being maximal change and 0 no change). 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 V es the vectors’ values to maximize the probabil- Having two representational perspectives – We followed the methods introduced by Osgood associative entropy ity in which, on average, these words could pre- synchronic and diachronic – we test the hypothe- [5] for analyzing the variability of morphologi- dict their context. As a result, words that predict sis that words that exhibit stronger cluster salien- cal and associative structure of words. As shown Figure 6: Relation between associative entropy similar contexts would be represented with simi- cy in the synchronic model – i.e., are closer to in figure (5), there is an interesting difference of and corpus frequency (age 18-24). The blue re- lar vectors. 
This is much like linguistic items in a the cluster centroid – are less likely to change the relations between associative entropy (defined gression line corresponds to nouns, while the red classical structuralist paradigm, whose inter- over time in the diachronic model. We thus as HA = pi log pi, where pi is the rel- changeability at a given point or ‘slot’ in the syn- measure the correlation between the distance of a − i 2 line corresponds to verbs. ative frequency of the ith associated word) and word to its cluster centroid at a specific point in P tagmatic chain implies they share certain aspects corpus morphological entropy (defined as HM = of function or meaning. time and the degree of change the word under- q log q , where q is the relative frequency References The vectors’ dimensions are opaque from a went over the next decade. − i i 2 i i of form i in the MOKK corpus) between nouns linguistic point of view, as it is still not clear how P [1] James Deese. The structure of associations in and verbs. In nouns, the more varied the morphol- to interpret them individually. Only when the full 4 Experiment language and thought. Johns Hopkins Univer- range of the vectors’ dimensions is taken togeth- ogy of the noun is in the corpus, the more variable We used the 2nd version of Google Ngram of sity Press, 1966. er does meaning emerges in the semantic hyper- the associative field is (r = 0.202, 0.175 in the fiction English, from which 10 millions 5-grams space they occupy. The similarity of words is two ages). That can be interpreted as implying that [2]P eter´ Halacsy,´ Andras´ Kornai, Laszl´ o´ were sampled for each year from 1850-2009 to computed using the cosine distance between two the more varied the suffixation of a noun is, the Nemeth,´ Andras´ Rung, Istvan´ Szakadat,´ serve as our corpus. All words were lower cased. 
word vectors, with 0 being identical vectors, and and Viktor Tron.´ Creating open language Word2vec (Mikolov et al., 2013) was used as more variable associative relations it enters with 2 being maximally different: resources for hungarian. In LREC, 2004. the distributed word vector model. The model other words. In verbs, however, if the verb has a × more varied morphology, it has less associations [3] Zsolt Lengyel. Magyar asszociaci´ os´ normak´ (1) 1 was initiated to 50 dimensions for the word vec- 𝑑𝑑 tors’ representations, and the window size for (r = 0.194, both groups). As figure (6) shows, a enciklopedi´ aja´ I-II. TINTA konyvkiad¨ o,´ Bu- (∑𝑖𝑖=)1 𝑊𝑊×𝑖𝑖 𝑊𝑊′𝑖𝑖 ( ) − − context set to 4, which is the maximum size giv- similar relationship has been obtained between as- dapest, 2008-2010. 𝑑𝑑 2 𝑑𝑑 2 �∑𝑖𝑖=1 𝑊𝑊𝑖𝑖 �∑𝑖𝑖=1 𝑊𝑊′𝑖𝑖

118 67 en the constraints of the corpus. Words that ap- shutters, 0.04 hat, 0.03 A study of relations between associative structure and morphological peared less than 10 times in the entire corpus windows, 0.05 cap, 0.04 structure of Hungarian words were discarded from the model vocabulary. doors, 0.08 napkin, 0.09 Training the model was done year by year, and curtains, 0.1 spectacles, 0.09 versions of the model were saved in 10 year in- blinds, 0.11 helmet, 0.13 Daniel´ Czegel´ Zsolt Lengyel gates, 0.13 cloak, 0.14 tervals from 1900 to 2000. Budapest University of Technology Pannon University, Veszprem´ gallop, 0.02 handkerchief, 0.14 and Economics The 7000 most frequent words in the corpus trot, 0.02 cane, 0.15 were chosen as words-of-interest, representing Table 1: Example for clusters of words using 2000 czegel [email protected] the entire lexicon. For each of these words, the clusters and their distance from their centroids. cosine distance between its two vectors, at a spe- cific year and 10 years later, was computed using Figure 1 shows the analysis of changes in Csaba Pleh´ (1) above to represent the degree of meaning word meanings for the years 1950-1960. We Central European University, Budapest change. A standard K-means clustering proce- chose this decade at random, but the general [email protected] dure was conducted on the vector representations trend observed here obtains over the entire peri- of the words for the beginning of each decade od (1900-2000). There is a correlation between sit from 1900 to 2000 and for different number of The paper mainly aims to reanalyze data with do hammer down the words’ distances from their centroids and the put quick sit clusters from 500 until 5000 in increments of go degree of meaning change they underwent in the the presently available corpus linguistics tools old slow square walk 500. 
The distances of words from their cluster catch following decade, and this correlation is observ- sight of hold from a relatively large scale paper-and-pencil  newspaper come clock keep hour chair send do enter go see  page centroids were computed for each cluster, using run watch able for different number of clusters (e.g., for away lesson side based Hungarian verbal association dictionary to bed purchase daughter paper money work stand town difficult corner (1) above. These distances were correlated with earnlook forsearch find heavy baby 500 clusters, 1000 clusters, and so on). The posi- male woman doll with regard to two aspects. i) The mental lexicon man short stem littlefootleg sound hand street teach think voice soldier strong high job the degree of change the words underwent in the people village say tall tive correlations (r>.3) mean that the more distal can girl lion ADJ name issue. How are associative overlaps representing know to   look at ADJ MOD N young power place long speak space following ten-year period. The correlation be- to help part big ADJN learn promise group mountain a word is from its cluster’s centroid, the greater understand acquaintance  square house answer boy free great known grandmother simple turn river ADVN ii) to ask love allowed window structural relations in the mental lexicon? The country familiar man like child outdoors INF tween the distance of words from random cen- word excuse father table head roompicture the change its word vectors exhibit the following give have to carpet MOD to hungarian beautiful new must milk systemic variability of the associative fields mo- hear book shape deep MODV needtranquil interesting smooth guest life troids of different clusters, on the one hand, and to dear forest N bad world decade, and vice versa. 
matter road writerussian friendkind impression law good dear way NV earth eye ocean bilized by the stimulus words: how variable the time result expensive order thief mother clean V the degree of change, on the other hand, served  weather truth mark to school Crucially, the correlations of the distances flower law ticketremember joy green towish besilent dream holiday wrath health year responses are, and how these associative entropies doctor yellow to listen to blue as a control condition. loud physician to dot from the centroid outperform the correlations of music moon point live dark colour problem lamp hungry score white are related to morphological entropies of the same red black the distances from the prototypical exemplar, day sun 4.1 Results to light thirsty sleep bread words. production hard stove which was defined as the exemplar that is the soft drink salt fruit sweet water war Table 1 shows six examples of clusters of words. to morning full closest to the centroid. Both the correlations of sour eat cold illness bitter cinema butter The clusters contain words that are semantically stomach the distance from the cluster centroid and of the 1 Methods and materials saturday similar, as well as their distances from their clus- distance from the prototypical exemplar were ter centroids. It is important to stress that a cen- significantly better than the correlations of the For the associative corpora, two dictionaries of troid is a mathematical entity, and is not neces- Figure 1: Associative field of children (age 10-14) control condition (all p’s < .001 under permuta- Lengyel [3] were used. They are based on the sarily identical to any particular exemplar. We tions tests). responses of 2000 students between 10 14 and suggest interpreting a word’s distance from its − 18 24 to about 200 stimulus words. 
Digitized square cluster’s centroid as the degree of its proximity − quick part to a category’s prototype, or, more generally, as responses from this dictionary were related to the hammer go chair paper side year frequency distribution of 800 million web-based law run clock way thief street enter a measure of prototypicality. Defined in this hour country lesson production music hungarian sit down square stem find do full Hungarian words from the MOKK corpus [2]. short space daughterbaby way, sword is a more prototypical exemplar than sit stop city doll foot little head mountain heavyhold word name milk come send high ocean law hand spear or dagger, and windows, shutters or doors picture to table cinemaschool village carpet write catch long house window ADJ do group sight of time big world river ADJMODN weather boy room may be more prototypical exemplars of a cover say place voice order ADJ N know book new sound 2 Results think smooth wood work work figure Adv N answer learn result man teach matter interestingsimple free yellow russian beautiful INF of an entrance than blinds or gates. In addition, young eye flower colour promise to child girl allowed walk earth man blue MOD speak toask help outdoors good lamp ticket to be clean red moon understand truthto green N life the clusters capture near-synonyms, like gallop look for to wanting woman live corner to joy N V father like dream sun give hear friend bad 2.1 Associative overlaps and lexical fields lookacquaintance at to dear day V to dark remember thanquil black lion to listen to stand dear strong and trot, and level-of-category relations, e.g., the guest health white to be silent bed light see excuse to mother wish hard news sleep impression money stove newspaper feast bread modal predicates allowed, permitted, able. The buy morning water Based on the associative overlap measure intro- saturday power trouble to deep Figure 1. 
Change in the meanings of words correlated eat soldier go away sweet slow very fact that the model captures clusters and wrath salt cold duced by Deese [1], a multidimensional scaling grandmother soft with distance from centroid for different numbers of people loud butter full distances of words which are intuitively felt to be drink hungrybitter fruit clusters, for the years 1950-1960. method was used to obtain associative fields de- thirsty doctor semantically closer to or farther away from a cat- war picting the pairwise associative distance of stim- illness old egory prototype is already an indication that the sour stomach In other words, the likelihood of a word ulus words in a two-dimensional figure. The re- model is on the right track. changing its meaning is better correlated with the sults indicate that young adults have a more dense Figure 2: Associative field of young adults (age distance from an abstract measure than with the structure, their associative clusters are more tight 18-24) distance from an actual word. For example, the compared to those of children of age 10-14, as il- sword, 0.06 allowed, 0.02 likelihood of change in the sword-spear-dagger lustrated in figure (1) and (2) and shown quantita- spear, 0.07 permitted, 0.04 cluster is better predicted by a word’s closeness dagger, 0.09 able, 0.06 tively in figure (3) and (4).
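The two measures compared in this section, a word's distance from its cluster's centroid and its distance from the prototypical exemplar (the word closest to that centroid), can be illustrated with a minimal sketch. The two-dimensional vectors, the function names, and the choice of Euclidean distance are all assumptions made for illustration; the vector space and metric actually used are not specified in this passage.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def distance(u, v):
    """Euclidean distance (an assumed metric, used here only for illustration)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def prototypicality(cluster):
    """Return the centroid, each word's distance to it, and the closest word.

    The closest word plays the role of the 'prototypical exemplar'.
    """
    c = centroid(list(cluster.values()))
    dists = {word: distance(vec, c) for word, vec in cluster.items()}
    prototype = min(dists, key=dists.get)
    return c, dists, prototype

# Hypothetical toy vectors, not taken from the paper's data.
weapons = {"sword": [0.0, 0.0], "spear": [3.0, 0.0], "dagger": [0.0, 3.0]}
c, dists, proto = prototypicality(weapons)
print(c, proto)
```

In this toy cluster the centroid coincides with none of the three words, which mirrors the point that a centroid is a mathematical entity rather than an actual exemplar.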

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

to the centroid, which perhaps could be conceptualized as a non-lexicalized ‘elongated weapon with a sharp point,’ than its closeness to an actual word, e.g., sword. This is a curious finding, which seems counter-intuitive for nearly all theories of lexical meaning and meaning change.

The magnitude of correlations is not fixed or randomly fluctuating, but rather depends on the number of clusters used. It peaks for about 3500 clusters, after which it drops sharply. Since a larger number of clusters necessarily means smaller ‘semantic areas’ that are shared by fewer words, this suggests that there is an optimal range for the size of clusters, which should not be too small or too large.

4.2 Theoretical implications

One of our findings matches what might be expected, based on Geeraerts’s hypothesis, mentioned in Section 1: a word’s distance from its cluster’s most prototypical exemplar is quite informative with respect to how well it fits the cluster (Fig. 1). This could be taken to corroborate Roschian prototype-based views. However, another finding is more surprising, namely, that a word’s distance from its real centroid, an abstract average of the members of a category by definition, is even more informative than the word’s distance from the cluster’s most prototypical exemplar.

In fact, our findings are consonant with recent work in usage-based linguistics on attractors, ‘the state(s) or patterns toward which a system is drawn’ (Bybee and Beckner, 2015). Importantly, attractors are ‘mathematical abstractions (potentially involving many variables in a multidimensional state space)’. We do not claim that the centroids of the categories identified in our work are attractors – although this may be the case – but rather make the more general point that an abstract mathematical entity might be relevant for knowledge of language and for language change.

In the domain of meaning change, the fact that words farther from their cluster’s centroid are more prone to change is in itself an innovative result, for at least two reasons. First, it shows on unbiased quantitative grounds that the internal structure of semantic categories or clusters is a factor in the relative stability over time of a word’s meaning. Second, it demonstrates this on the basis of an entire corpus, rather than an individual word. Ideas in this vein have been proposed in the linguistics literature (Geeraerts, 1997), but on the basis of isolated case studies which were then generalized.

5 Conclusion

We have shown an automated bottom-up approach for category formation, which was done on an entire corpus using the entire lexicon. We have used this approach to supply historical linguistics with a new quantitative tool to test hypotheses about change in word meanings. Our main findings are that the likelihood of a word’s meaning changing over time correlates with its closeness to its semantic cluster’s most prototypical exemplar, defined as the word closest to the cluster’s centroid. Crucially, even better than the correlation between distance from the prototypical exemplar and the likelihood of change is the correlation between the likelihood of change and the closeness of a word to its cluster’s actual centroid, which is a mathematical abstraction. This finding is surprising, but is comparable to the idea that attractors, which are also mathematical abstractions, may be relevant for language change.

Acknowledgements

We thank Daphna Weinshall (Hebrew University of Jerusalem) and Stéphane Polis (University of Liège) for their helpful and insightful comments. All errors are, of course, our own.

References

Joan Bybee. 2010. Language, usage and cognition. Cambridge: Cambridge University Press.

Joan Bybee and Clay Beckner. 2015. Emergence at the cross-linguistic level. In B. MacWhinney and W. O'Grady (eds.), The handbook of language emergence, 181-200. Wiley Blackwell.

Dirk Geeraerts. 1997. Diachronic prototype semantics. A contribution to historical lexicology. Oxford: Clarendon Press.

Dirk Geeraerts. 1999. Diachronic Prototype Semantics. A Digest. In A. Blank and P. Koch (eds.), Historical semantics and cognition. Berlin & New York: Mouton de Gruyter.

Dirk Geeraerts and Hubert Cuyckens (eds.). 2007. The Oxford handbook of cognitive linguistics. Oxford: Oxford University Press.

Dirk Geeraerts, Caroline Gevaerts, and Dirk Speelman. 2011. How Anger Rose: Hypothesis Testing in Diachronic Semantics. In K. Allan and J. A. Robinson (eds.), Current methods in historical semantics, 109-132. Berlin & New York: Mouton de Gruyter.

Martin Hilpert. 2006. Distinctive Collexeme Analysis and Diachrony. Corpus Linguistics and Linguistic Theory, 2 (2): 243–256.

Martin Hilpert and Stefan Th. Gries. 2014. Quantitative Approaches to Diachronic Corpus Linguistics. In M. Kytö and P. Pahta (eds.), The Cambridge Handbook of English Historical Linguistics. Cambridge: Cambridge University Press, 2014.

Yoon Kim, Yi-I Chiu, Kentaro Haraki, Darshan Hegde, and Slav Petrov. 2014. Temporal Analysis of Language through Neural Language Models. Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, 61-65. Baltimore, USA.

Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013. Linguistic Regularities in Continuous Space Word Representations. Proceedings of NAACL-HLT 2013: 746–751. Atlanta, Georgia.

Eleanor H. Rosch. 1973. Natural Categories. Cognitive Psychology 4 (3): 328–350.

Eyal Sagi, Stefan Kaufmann, and Brady Clark. 2011. Tracing semantic change with latent semantic analysis. In K. Allan and J. A. Robinson (eds.), Current methods in historical semantics, 161-183. Berlin & New York: Mouton de Gruyter.

Derry T. Wijaya and Reyyan Yeniterzi. 2011. Understanding semantic change of words over centuries. In Proceedings of the 2011 international workshop on DETecting and Exploiting Cultural diversiTy on the social web (DETECT ’11), 35-40. Glasgow, United Kingdom.

Finally, in contrast to prediction 5, English control words with mismatching orthography were not processed more quickly than control words with ambiguous orthography. Apparently, mismatching orthography in general did not result in any systematic interference on word processing speed. Said differently, the noise introduced by spuriously activated word candidates from Russian with overlapping letters in the other control conditions did not systematically affect the lexical decision to the English target word, although it may have affected the participants’ general decision-making strategies in the experiment. In terms of interactive activation models, the increase in noise could be cancelled out by a somewhat higher reliance on semantic codes or global lexical activation (Grainger and Jacobs, 1996) for making the lexical decision.

In all, the obtained patterns of results are in support of interactive activation models for bilingual word recognition, such as the BIA+ model (Dijkstra and Van Heuven, 2002) when the assumption is made that cognates are represented in terms of overlapping but lexically competing form representations and largely shared semantic representations in the two languages (Dijkstra, Miwa et al., 2010), see Figure 1. Even the somewhat counter-intuitive prediction 4 can find a reasonable explanation in terms of such models. Prediction 5 was not confirmed, but the actually obtained result can be interpreted in terms of slightly shifted lexical decision criteria.

This study confirms the presence of language non-selective lexical access in visual word recognition by different-script bilinguals, in line with findings for Korean-English (Kim and Davis, 2003) and Japanese-English (Hoshino and Kroll, 2008; Miwa et al., 2014; Ando et al., 2015). Moreover, it bridges research on shared scripts and different scripts by considering the partially overlapping Latin and Cyrillic scripts of English and Russian. It is innovative in showing that cross-linguistic effects depend on the degree of overlap in scripts depending on the exact characteristics of the words involved.

The study also provides indirect support for various types of models that assume co-activation of word candidates that are orthographically similar to the input letter string. The set of such candidates is often referred to as the neighbourhood (Grainger and Dijkstra, 1992). Van Heuven et al. (1998) have shown that the number of neighbours within and between languages affects bilingual word recognition. This result has recently been confirmed by Mulder and Dijkstra (under revision). The present study provides confirmation for these models from a completely independent perspective, that of cross-linguistic similarity effects in scripts.

To conclude, we presented evidence in favor of language non-selective lexical access in Russian-English bilinguals, showing an English-Russian cognate facilitation effect, the size of which depended on whether there was overlap in orthography or not, and on whether this overlap was ambiguous or transparent relative to phonology. These effects were shown to be lexical in nature, because mismatching orthography in control target words with translations that are completely different in form did not show any evidence of differential processing.

Acknowledgments

This research was made possible with support from NetWordS, the «European Network on word structure in the languages in Europe» (research grant n° 09-RNP-089). The authors are also deeply indebted to the EPSRC's RefNet research network that enabled us to collect a large part of our data. We also want to thank the anonymous reviewers for their helpful comments on an earlier version of this paper.

References

Eriko Ando, Kazunaga Matsuki, Heather Sheridan, and Debra Jared. 2015. The locus of Katakana-English masked phonological priming effects. Bilingualism: Language and Cognition, 18: 101-117.

Ton Dijkstra, Béryl Hilberink-Schulpen, and Walter J. B. Van Heuven. 2010. Repetition and masked form priming within and between languages using word and nonword neighbors. Bilingualism: Language and Cognition, 13: 341-357.

Ton Dijkstra, Koji Miwa, Bianca Brummelhuis, Maya Sappelli, and Harald R. Baayen. 2010. How cross-language similarity and task demands affect cognate recognition. Journal of Memory and Language, 62: 284-301.

Ton Dijkstra and Walter J. B. Van Heuven. 2002. The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5: 175-197.

Jonathan Grainger and Arthur M. Jacobs. 1996. Orthographic processing in visual word recognition: A multiple read-out model. Psychological Review, 103: 518-565.

Jonathan Grainger and Ton Dijkstra. 1992. On the representation and use of language information in bilinguals. In R. J. Harris (ed.), Cognitive Processing in Bilinguals, 207-220. Amsterdam: Elsevier.

Noriko Hoshino and Judith F. Kroll. 2008. Cognate effects in picture naming: Does cross-language activation survive a change of script? Cognition, 106: 501-511.

Olessia Jouravlev and Debra Jared. 2014. Reading Russian–English homographs in sentence contexts: Evidence from ERPs. Bilingualism: Language and Cognition, 17: 153-168.

Margarita Kaushanskaya and Viorica Marian. 2007. Bilingual Language Processing and Interference in Bilinguals: Evidence From Eye Tracking and Picture Naming. Language Learning, 57: 119-163.

Jeesun Kim and Chris Davis. 2003. Task effects in masked cross-script translation and phonological priming. Journal of Memory and Language, 49: 484-499.

Kristin Lemhöfer and Ton Dijkstra. 2004. Recognizing cognates and interlingual homographs: Effects of code similarity in language specific and generalized lexical decision. Memory and Cognition, 32: 533-550.

Viorica Marian and Michael Spivey. 2003. Bilingual and monolingual processing of competing lexical items. Applied Psycholinguistics, 24: 173-193.

Koji Miwa, Ton Dijkstra, Patrick Bolger, and Harald R. Baayen. 2014. Reading English with Japanese in mind: Effects of frequency, phonology, and meaning in different-script bilinguals. Bilingualism: Language and Cognition, 17: 445-463.

Kimberley Mulder and Ton Dijkstra (under revision). Revisiting the neighbourhood: The processing of cross-language hermits and neighbours in different tasks. Bilingualism: Language & Cognition.

David Peeters, Ton Dijkstra, and Jonathan Grainger. 2013. The representation and processing of identical cognates by late bilinguals: RT and ERP effects. Journal of Memory and Language, 68: 315-332.

Walter J. B. Van Heuven, Ton Dijkstra, and Jonathan Grainger. 1998. Orthographic neighborhood effects in bilingual word recognition. Journal of Memory and Language, 39: 458-483.

Condition Type   Cognates       Controls       RT difference
Base             661 (82.2)     727 (112.7)    66
                 .97            .95
Minus            711 (105.7)    734 (106.4)    23
                 .94            .93
Plus             656 (89.01)    730 (113.1)    74
                 .97            .92

Table 1. Mean reaction times and accuracies for word categories (standard deviations between parentheses).

The word data were analyzed by means of a repeated-measures Analysis of Variance (ANOVA), using cognate type (3, MO vs. AO vs. TO) and cognate status (2, cognate vs. control) as within-subject factors. This analysis resulted in main effects of Cognate Status (F (1, 27) = 94.11, p<.001), Item Type (F (2, 54) = 9.89, p<.001), and an interaction of Cognate Status with Item Type (F (2, 54) = 10.22, p<.001).

Next, we did planned comparisons to test the Cognate Minus (CAO) and Cognate Plus (CTO) conditions against the Cognate Base (CMO) condition. Significant RT differences were found between the Cognate Base condition and the Cognate Minus condition (t(27)=-5.0, p<.001 two-tailed) but not between the Cognate Base and the Cognate Plus condition (t(27)=.60, p=.55). There was a significant difference between the Cognate Base condition and the Control Base condition (t(27)=-6.54, p<.001). Finally, no significant differences arose between the different control conditions (Control Base vs. Control Minus, t(27)=-.67, p=.51; Control Base vs. Control Plus t(27)=-.36, p=.72).

5 Discussion

Russian-English bilinguals performed an English lexical decision task with purely English control words and English-Russian cognates 1) with mismatching orthography or 2) shared orthography with a) transparent or b) ambiguous mappings on phonemes in Russian and English.

Responses to cognates were faster than to English controls (see Table 1). This cognate facilitation effect is in line with prediction 1 that lexical candidates in both Russian and English are activated during Russian-English bilingual word recognition.

It also confirms prediction 2 that language non-selective lexical access takes place in Russian-English word recognition. Because the effect is also observed in cognates with (partially) mismatching orthography, the cognate effect may in part be ascribed to the phonological and semantic overlap in these cognates. Thus, the orthographic input representation quickly leads to an activation of sublexical and lexical phonological representations (cf. Peeters et al., 2013).

In line with prediction 3, the cognate facilitation effect is modulated by the degree of shared transparent overlap between Russian and English alphabets. Cognates with transparent orthography were processed faster than cognates with ambiguous grapheme to phoneme mappings. This finding can be explained by assuming that Russian words are co-activated with English words to the extent that they match the English letter input, irrespective of whether this matching is in terms of block letters or handwritten visual similarity. In other words, it is purely a bottom-up (signal-driven) effect.

Figure 1. Localist connectionist illustration of cognate representation and processing, adapted from Dijkstra, Miwa et al. (2010).

The finding that cognates with mismatching orthography and shared orthography with transparent grapheme-to-phoneme mappings are responded to about equally fast, is in line with prediction 4, which is based on the representation for cognates that has been proposed by Dijkstra, Miwa et al. (2010). As Figure 1 indicates, both form representations of cognates are assumed to be activated based on the input and they spread activation to convergent semantic representations. The co-activation of form representations results in lexical competition and interference (Dijkstra, Hilberink-Schulpen et al., 2010), whereas the convergence on semantics results in facilitation. As a result, the RT difference between cognates with mismatching orthography and shared transparent orthography may be relatively small, due to a cancelling out of the effects of increased lexical form competition and increased semantic co-activation.

What NN compounding in child language tells us about categorization

Maria Rosenberg, Dept. of Language Studies, Umeå university, SE-901 87 Umeå, SWEDEN
Ingmarie Mellenius, Dept. of Language Studies, Umeå university, SE-901 87 Umeå, SWEDEN

1 Introduction

The present study examines novel NN compounds, produced on line, in Swedish child language, with focus on categorization. Given that NN compounds denote objects, we concentrate on the categories those objects belong to. In that way, our study aims to provide evidence of object categorization in preschool children. Two questions are put forward:
(i) Does perception play a crucial role for the children’s coinages?
(ii) In what way do structural and processing views on categorization apply to the data?

Swedish children produce compounds already at age two, reflecting the fact that compounding is a productive word formation device. In short, Swedish compounds are right-headed, written as one word, pronounced with a two-peak-intonation, and can exhibit liaison forms.

2 Theoretical background

Clark (2004) argues that language acquisition builds upon already established conceptual information, which enables the child to categorize objects, relations and events. Children rely mainly on shape as they embark on the mapping of words for objects onto their conceptual categories of objects, but also pay attention to texture, size, sound, motion and function. Even into adulthood, children continue the mapping of unknown linguistic items onto conceptual representations. Young children occasionally form emergent categories, based on non-conventional distinctions (e.g. ball for round things). Clark (2004) notes that the pairing of word to object enables the child to perceive similarities between cognitive categories, and allows for alternate perspectives on objects (cf. Waxman and Markow, 1995). “Do their [children’s] categories reflect only what their language offers, or do they – must they – make use of other representations too?” (Clark, 2004:472).

Berman (2009) emphasizes that there is a substantial difference in adults’ vs. children’s lexicons of established compounds, and that children have to grasp inter alia the idea of subcategorization. Clark and Berman claim that “knowledge of the pertinent lexical items, and not the constructions they appear in, is more important for [children’s] compounding” (1987:560).

In conceptual development, category structures change with age (Keil and Kelly, 1987). Object categorization allows generalization over properties of objects and of novel category members (Mandler, 2000).

Bornstein and Arterberry (2010) mention two complementing views of categorization: processing and structural. On the processing view, categories are flexible and category membership of objects can vary in different situations (cf. e.g. Jones and Smith, 1993). On the structure view, categories are hierarchically organized taxonomies (cf. e.g. Murphy, 2002). Instead of Rosch’s (1978) superordinate-basic-subordinate, levels of category inclusiveness can be ordered in a neutral way, such as L1 (animal), L2 (cat, dogs), L3 (collies, shepherds), L4 (scotch collies, border collies) (Bornstein and Arterberry, 2010:3).

Whether categorization proceeds from concrete to abstract or the other way around is still under debate. Differentiation theory (e.g. Gibson, 1969) stipulates that the ability to make finer differentiations emerges after broad conceptions are acquired. Likewise, Bornstein and Arterberry (2010) indicate that more inclusive levels of categorization appear before less inclusive ones,

and that high perceptual contrasts have precedence over low. Fisher (2011) suggests that at age 3-5, perceptual information is anchored more strongly than conceptual information; cognitive flexibility develops with age.

Yet, according to Smith (1984), preschool children show the ability of both concrete categorization, due to perceptual characteristics, and abstract categorization, leaning on conceptual relationships. Nguyen and Murphy (2003) posit three categorization forms: taxonomic (see above), script and thematic. Script-based categories include objects (e.g. egg, cereal) with the same functional role in a routine event (e.g. eating breakfast). Thematic categories involve objects that usually appear together (e.g. bowl-cereal). Nguyen and Murphy (2003) show that children, aged 3 to 7, use taxonomic and script categorization in a flexible way.

3 Data and method

The data consists of 383 spontaneously produced NN compounds from three monolingual Swedish children, aged 1 to 6, collected longitudinally and including contextual information. The children often give an explanation of the intended meaning, e.g. hundstall ‘dog-stable’, ‘where dogs live, outside’. Hence, they seem to understand the semantics of their novel compound. We use a strict selection criterion: only non-established compounds in contemporary Swedish are considered.

As a first step to analyze our data, we sort the compounds in two ways: (i) based on N1; (ii) based on N2. This is a way of locating items belonging to the same morphological family (cf. Schreuder and Baayen, 1997). As a second step, the data is analyzed according to: (iii) level of inclusiveness; (iv) script; (v) thematicity; (vi) perception (real-world referent or not, high contrasts vs. low). As a third step, other characteristics appearing from the children’s compounds are analyzed.

4 Analysis

In the analysis we provide evidence of categorization concerning larger groups of compounds. Below follow some preliminary findings. Note that the compounds can be analyzed according to different parameters and, thus, some of them fall under several labels, depending on the parameter taken into account.

4.1 N1 and N2 sorting

The sorting of N1 and N2 shows that several nouns reoccur in the children’s compounds. 126 N1 of the 383 compounds were either identical or belonging to the same morphological family, such as morotsvatten ‘carrot-s-water’ and moröttermacka ‘carrots-sandwich’. With respect to N2, this number was as high as 143.

The largest morphological family found in our data contains vatten ‘water’. 12 compounds are attested (7 compounds from one child, whereof 4 have vatten as N1, and 3 as N2). The two other children used vatten in 4 and 1 instances respectively, such as vattenkaffe ‘water-coffee’ or the aforementioned morotsvatten ‘carrot-s-water’.

Other nouns that reoccurred nearly ten times among the innovations of all three children were bil ‘car’, kläder ‘clothes’, mamma ‘mommy’ and väg ‘road’ (cf. 4.6).

It is worth noting that although the same nouns were used in several compounds, they did not always uphold the same relation to the other constituent: pizzabil ‘pizza-car’ was used for a car with a pizza print on it (viz. perceptually), whereas dimbil ‘fog-car’ referred to an imaginative car spraying fog (viz. abstractly).

Overall, the overlap between the same nouns being used in several compounds and as first or second constituent, can be taken as support for Clark’s and Berman’s (1987) claim (cf. 2) that children use lexical items that they are familiar with in their compounding.

4.2 Level of inclusiveness

As for the level of inclusiveness, the compounds in our data are situated on L1 (björkgrej ‘birch-thing’), L2 (brödrosta ‘toaster’), L3 (äppelsvans ‘apple-tail’) or L4 (hjärtklackskorna ‘heart-heel-shoes’), with L3 as the predominant level. If we look only at N1 or N2 in isolation, they can also correspond to items located at L1 (djur ‘animal’), L2 (björn ‘bear’), or L3 (äppeljuice ‘apple-juice’) in three-part compounds.

Moreover, there are some compounds in our data containing a taxonomic relation between the constituents: two examples are ugglafågel ‘owl-bird’ and skinndjur ‘skin-animals’.

4.3 Script-based categories

Entire sets of the compounds can be analyzed as having the same role with respect to a script, in which the compounds fulfill the same part. All three children categorize clothes according to season or weather, as indicated by N1: sommarvantar ‘summer-gloves’, snöstrumpor ‘snow-stockings’ or vinterficka ‘winter-pocket’.

There are also compounds in our data where N1 and N2 participate in the same scripts that concern different types of edibles: grötmjölk ‘porridge-milk’ (“eating breakfast”) or pizzahamburgare ‘pizza-hamburger’ (“eating dinner”) or saftglass ‘syrup-ice cream’ (“eating dessert”).

4.4 Thematic categories

Thematic categories, items with close semantic association based on, e.g., contiguity, are numerous within the compounds. An example is häxafiskspö ‘witch-fish-wand’, where the child aims at a wand used by a witch, but confuses trollspö ‘magic-wand’ with fiskespö ‘fishing-pole’, and then adds the user of the item in question (actually a case of “overcategorization”, cf. 4.6).

Several themes are found. One is “sweets”, giving rise to numerous compounds, semantically associated or not, such as silvergodis ‘silver-candy’ and godisstrumpor ‘candy-stockings’.

Most of the thematic categorization found in the children’s innovation is abstract and grounded in conceptual information. Furthermore, the thematic relations are mostly of an inherent nature, such as manifested by djungelträd ‘jungle-tree’, rather than temporal, such as fotbollsplanet ‘football-planet’.

4.5 Perception

Compounds categorized according to Shape are attested, such as R-paprika ‘a piece of paprika that looks like a R’, or mössaboll ‘hat crumpled into the shape of a ball’.

redundancy, is one way to arrive at overcategorization, as we see it. For instance, kogräs ‘cow-grass’ denotes ‘ordinary grass, that cows eat’ according to one child. Additionally, an ordinary car is referred to as motorbil ‘motor-car’, or handfinger ‘hand-finger’ is used instead of just finger for the body part. In these three examples, N2 alone would have been the target-like word to use, but the children limit its use further.

A quite odd categorization made by all three children, independently, is to add the goal of a direction to the direction: kalasväg ‘party-road’ or mormorväg ‘granny-road’. Recall that väg ‘road’ was one of the nouns that reoccurred frequently among the novel compounds (cf. 4.1). Hence, the three children seem to find it important to name particular roads.

Furthermore, nearly 20 of the children’s compounds contain one of the words mamma ‘mommy’, pappa ‘daddy’ or bebis ‘baby’ as N1 or N2, such as mammfluga ‘mommy-fly’, fågelpappa ‘bird-daddy’ or bebismyra ‘baby-ant’. All three children coined such compounds, which we interpret as a kind of emergent categorization, as well as of overcategorization. There were two types of relations involved in these compounds: animals or insects subcategorized according to human kinship terms as in the preceding examples; mommy or daddy subcategorized according to some habit, such as cigarettpappa ‘cigarette-daddy’.

4.7 Ad hoc categorization

Barsalou (1983) uses the label ad hoc categories for categories constructed on the spot to achieve

Are you reading what I am reading?
The impact of contrasting alphabetic scripts on reading English

Tatiana Iakovleva, CNRS, France, 59, rue Pouchet, 75017 Paris, tatiakovleva@yahoo.fr
Anna E. Piasecki, UWE Bristol, Coldharbour Lane, Bristol BS16 1QY, Anna.Piasecki@uwe.ac.uk
Ton Dijkstra, Radboud University, Montessorilaan 3, 6500 HE Nijmegen, T.Dijkstra@donders.ru.nl

1 Introduction

This study examines the impact of the cross-linguistic similarity of translation equivalents on word recognition by Russian-English bilinguals, who are fluent in languages with two different but partially overlapping writing systems. Current models for bilingual word recognition, like BIA+, hold that all words that are similar to the input letter string are activated and considered for selection, irrespective of the language to which they belong (Dijkstra and Van Heuven, 2002). These activation models are consistent with empirical data for bilinguals with totally different scripts, like Japanese and English (Miwa et al., 2014). Little is known about the bilingual processing of Russian and English, but studies indicate that the partially distinct character of the Russian and English scripts does not prevent co-activation (Jouravlev and Jared, 2014; Marian and Spivey, 2003; Kaushanskaya and Marian, 2007).

Many Russian-English translation equivalents are in part composed of shared letters that

vergence and divergence in Russian and English script coding for cognates and non-cognates. Cognates are translation equivalents with significant cross-linguistic form overlap in phonology and/or orthography (e.g., ‘marriage’ in English, ‘mariage’ in French). Cognates are generally processed more quickly by bilinguals than matched control words (for an overview of studies, see Dijkstra, Miwa et al., 2010). However, as far as we know, cognate processing for the Russian-English language pair has not been examined before.

2 Predictions

We are making the following predictions about English word recognition by Russian-English bilinguals:
1. In English word processing, Russian-English bilinguals will activate lexical candidates that are similar to the input word in both Russian and English (language non-selective lexical access).
2. English-Russian cognates will be recognized more quickly than English control words, due to

transparency will lead to increased semantic co-activation of cognates in the two languages.
5. English control words with mismatching orthography will be processed more quickly than words with ambiguous orthography, because less interference from the Russian alphabet is expected in the first case.

3 Method

To test these hypotheses, we first constructed a large database of Russian-English cognates with three, four, five or six letters in length. To our knowledge, no such database is currently available to the community of researchers. Next, 75 English cognates were selected as test words in a lexical decision task. Orthographic coding was performed on English cognate words written in lower-case block letters in Arial font. The resulting items were allocated to three categories: 1) Cognates with Ambiguous Orthography (CAO=Minus condition), composed of letters that have different phonological mappings in English and Russian (e.g. ‘guru’ might be read as /digi/ if a Russian monolingual was asked to read this string of letters); 2) Cognates with Transparent Orthography (CTO=Positive condition), composed of letters that largely share their orthographic-phonological mappings with letters of the Russian alphabet (e.g. in ‘koala’ the only mismatch with the Russian alphabet is the grapheme ‘l’); 3) Cognates with Mismatching Orthography (CMO=Base condition), composed mostly of letters that do not exist in the Russian alphabet (e.g. ‘filter’). The cognate types were matched across conditions (CAO/CTO/CMO) in word length, frequency, and degree of cross-linguistic orthographic overlap between Russian and English alphabets. Three groups of control words were then selected that matched the cognates of each type with respect to these three dimensions. Finally, each cognate and non-cognate was matched with a pseudo-word generated with the help of the Wuggy-software (crr.ugent.be).

Next, 20 Russian-English bilinguals were asked to rate the visual similarity between the English cognates and their Russian translation equivalents. They also rated the semantic similarity of all selected item pairs. Rating results showed that bilinguals mostly considered orthographic congruence (as opposed to incongruence) between the orthography of Russian and English translation equivalents and gave higher ratings to English words that have shared orthography with the Russian alphabet. Ratings also indicated that bilinguals considered not only block-letters but also corresponding handwritten graphemes when rating the visual similarity between words.

In total, 37 Russian-English participants (10 male vs. 27 female; age: 19-60 years) took part in the study. At the moment of testing, all participants were residing in English-speaking countries: 11 participants in Bristol, UK, 21 participants in Sheffield, UK, and 5 participants in New Zealand. After the experiment, all participants rated their proficiency in English on a scale from 1 (the lowest) to 6 (the highest). Average ratings for reading, writing, speaking, and listening varied between 4.4 and 5. Except for two participants, ages of L2 acquisition (AoA) ranged between 6 and 19 years. Length of residence in an English-speaking country varied between 3 months and 21 years (mean = 33 years, SD = 11 years).

Participants performed an English lexical decision task, in which they pressed a “yes” or a “no” button depending on whether a presented word was English or not. They were asked to press a button as quickly and accurately as possible. The items were presented in a pseudo-randomized order to each participant. The experiment was programmed in E-Prime. Reaction times (RTs) and accuracy of responses were measured. Only correct responses to real words were included in the analyses of reaction times.

4 Results

First, all responses faster than 300 ms and slower than 3 s were removed from the data set, because they were not considered as valid measurements. Next, the data from 9 participants were excluded from analysis, because they had a response accuracy below 70%. We removed 5 cognates, 8 control words, and 14 non-words from the items, because these items had an accuracy below 70% or had extremely slow responses. For the remaining 28 participants, after removing these items, cognate and control word conditions were still matched with respect to length and frequency (as shown by non-significant t-tests). None of the remaining responses were further apart than 2.5 SDs from the participant mean in each condition. The mean RT for non-words was 892 ms. Table 1 presents the mean RTs for words in each cognate and control word condition, as well as their accuracy.
Shape may concern ei- certain goals, such as “things to sell at a garage can potentially activate both Russian and English co-activation and convergence (cognate facilita- ther the head or the non-head of the compound. sale”. These categories are much less established word candidates. Often, these letters have am- tion effect, Dijkstra, Miwa et al., 2010; Lemhöfer Texture is involved in many of the children’s in memory than common categories. We inter- biguous phonemic mappings across the two lan- and Dijkstra, 2004). compounds, such as: pälsmatta ‘fur-carpet’. pret ad hoc categories to encompass compounds guages. The degree of ambiguity is high espe- 3. Cognates with ambiguous orthography, i.e. Prints are also a frequent way to distinguish such as Downing’s (1977) “apple-juice seat”, cially when shapes of block-letters and letters in shared letters mapping onto different phonemes among clothes they want to wear, or vehicles that and also the examples from Clark, Gelman and italics overlap across languages. For instance, a in the two languages, will be processed more they see, such as the above-mentioned “pizza- Lane (1985) claimed to involve a temporal rela- printed Russian letter ‘ и’ does not look like any slowly than cognates with mismatching orthog- car”. tion, in contrast to compounds with inherent rela- letter of the English alphabet, but the shape of its raphy, due to decreased facilitation from the oth- Yet, note that many of the children’s coinages, tions. According to Clark, Gelman and Lane handwritten equivalent ‘u’ perfectly coincides er cognate member. which involve perception, can do so in an imagi- (1985), children would more often use novel with the English hand-written grapheme. We The following two predictions are more nary way, or in other words, as mental imagery. compounds to express inherent relations among identified 5 overlapping pairs of printed English speculative and exploratory in nature. 
A compound, such as champagnetröja ‘cham- objects. The opposite stand is taken by Mellenius block-letters and Russian letters in italics (g, r, 4. Response times to cognates with transparent pagne-sweater’, was uttered to denote a non- (1997), supported by Berman (2009), who claims m, n, u). orthography, i.e. shared letters mapping onto the existent sweater that the child just dreamt up children’s novel compounds are “highly ‘con- Our study started from the assumption that same phonemes in the two languages, will be when playing. text-dependent’ and hence more likely to express even when a bilingual reads English words in about equal to those for cognates with mismatch- temporary rather than intrinsic relations” 4.6 Overcategorization printed font, letter shapes also activate handwrit- ing orthography, because transparent orthogra- (2009:311). ten Russian letters with similar shapes in a bot- phy and shared phonology will lead to increased We will use the term “overcategorization” to la- Some innovations in our data can be analyzed tom-up way. We focused on the impact of con- lexical competition, but, at the same time, the bel some striking features among the children’s as ad hoc instances that the children coin sponta- compounds. Underextension, often involving neously without a real naming demand. They are
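The outlier criterion mentioned above (no response further than 2.5 SDs from the participant mean in each condition) can be checked with a few lines of Python. This is only a minimal sketch: the function name `within_sd_cutoff`, the example reaction times, and the use of the sample standard deviation are assumptions made for illustration, not details reported by the authors.

```python
from statistics import mean, stdev

def within_sd_cutoff(rts, n_sd=2.5):
    """Return the RTs lying within n_sd standard deviations of their mean.

    `rts` holds the reaction times (ms) of one participant in one condition.
    The sample SD is used here; this is an assumption, not a reported detail.
    """
    if len(rts) < 2:  # the SD is undefined for fewer than two values
        return list(rts)
    centre, spread = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - centre) <= n_sd * spread]

# Hypothetical reaction times (ms) for one participant in one condition
example_rts = [520, 540, 560, 580, 600, 610, 590, 570, 550, 530, 1500]
kept = within_sd_cutoff(example_rts)
```

In this made-up cell, only the 1500 ms response falls outside the cutoff. A data set in which no response is excluded would correspond to the situation the text describes.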

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

(immediate priming). As Laudanna et al. (2004) have shown for verbal inflection, the effect of complex morphological properties on the processing of isolated words is more likely to be detected in off-line techniques, such as free recall tasks, implying a short-term and/or episodic memory component.

In addition, the assumed difference between transparent and partially opaque derivatives in priming their base forms might surface to a larger extent when the morphological condition is compared with a phonological priming condition (e.g. colazione/colare ‘breakfast/percolate’), in which no morphological relatedness is found between the prime and the target, although their formal relationship is the same as in a morphologically related pair (e.g. formazione/formare ‘formation/form’). This hypothesis is currently under investigation.

References

Pier Marco Bertinetto, Cristina Burani, Alessandro Laudanna, Lucia Marconi, Daniela Ratti, Claudia Rolando, and Anna Maria Thornton. 2005. Corpus e Lessico di Frequenza dell’Italiano Scritto (CoLFIS). http://linguistica.sns.it/CoLFIS/Home.htm

Wolfgang U. Dressler. 1985. On the Predictiveness of Natural Morphology. Journal of Linguistics, 21(2): 321–337.

Wolfgang U. Dressler. 2005. Word-Formation in Natural Morphology. In: Štichauer P., Lieber R. (eds) Handbook of Word-Formation, Springer: 267–284.

Alessandro Laudanna, Simone Gazzellini, and Maria de Martino. 2004. Representation of grammatical properties of Italian verbs in the mental lexicon. Brain and Language 90: 95–105.

Alessandro Laudanna, Anna Maria Thornton, Giorgina Brown, Cristina Burani, and Lucia Marconi. 1995. Un corpus dell’italiano scritto contemporaneo dalla parte del ricevente. In: Bolasco S., Lebart L., Salem A. (eds) III Giornate internazionali di Analisi Statistica dei Dati Testuali, Roma: Cisu: 103–109.

Gary Libben. 1998. Semantic Transparency in the Processing of Compounds: Consequences for Representation, Processing, and Impairment. Brain and Language 61: 30–44.

Luigi Talamo and Chiara Celata. 2011. Toward a morphological analysis of the Italian lexicon: developing tools for a corpus-based approach. Quaderni del Laboratorio di Linguistica della Scuola Normale Superiore di Pisa, 10.

Luigi Talamo, Pier Marco Bertinetto, and Chiara Celata (submitted). DERIVATARIO: An annotated lexicon of Italian derivatives.

Appendix A. Scale of morphotactic transparency.

DEGREE   NATURE OF PHENOMENON
mt1      none
mt2      purely prosodic and phonological (e.g. resyllabification, assimilation)
mt3      phonological, with neutralization of phonetic constituents (e.g. flapping)
mt4      morpho-phonological, without loss of constituents (e.g. articulatory weakening)
mt5      morpho-phonological, with loss of constituents (e.g. deletion)
mt6      purely morphological (e.g. paradigmatic alternation of affixes)
mt7      lexical: weak suppletion
mt8      lexical: strong suppletion

Appendix B. Experimental words.

Mt1 prime        Mt1 target     Mt4 prime        Mt4 target
disegnatore      disegnare      traducibile      tradurre
bruciatore       bruciare       discutibile      discutere
suggeritore      suggerire      tessitura        tessere
cancellazione    cancellare     competitore      competere
esclamazione     esclamare      emettitore       emettere
dominazione      dominare       roditore         rodere
nuotatore        nuotare        scommettitore    scommettere
accentuazione    accentuare     perseguibile     perseguire
bollitura        bollire        godibile         godere
piegatura        piegare        cedimento        cedere
fregatura        fregare        spargimento      spargere
intrusione       intruso        rassegnazione    rassegnato
perversione      perverso       concitazione     concitato
ribellione       ribelle        desolazione      desolato
introversione    introverso     discrezione      discreto
avversione       avverso        depravazione     depravato

typically difficult to understand, or do not make sense, outside the context of the utterance. An example is one child in the data that invents a triplet of compounds with glass ‘ice cream’ with the goal “things that could possibly constitute ice cream”: träglass ‘wood-ice cream’, sockerglass ‘sugar-ice cream’ and glassögon ‘ice cream-eyes’; the latter denotes, according to the child, ‘eye-glasses but made of ice cream (glass) instead of glass (glas)’. Another example is kungtröja ‘king-sweater’, coined on the spot when playing: ‘if you wear that sweater you will be the king’.

However, our data points in the direction that the children’s innovations more often express inherent relations than temporal relations, but this issue certainly merits further investigation.

5 Conclusion

The study provides evidence of on-line categorization based on spontaneous production of novel NN compounds from three Swedish children. Compared to experimental situations, limited by the material used and the children’s will and energy to participate, our collection of data is unique. It shows that high contrast perceptual features give rise to much subcategorization, however not at the expense of conceptual subcategorization, equally important in our data.

Since we lack clear longitudinal facts of how object categorization emerges within the children, the structure view is hard to apply. We can state that L3 and L4 categories appear around age 2, but lack numbers about their overall frequency in relation to more inclusive categories. Given that the children show cognitive flexibility in their categorization of an object in a particular way by producing an NN compound, the processing view conforms better to our data. To conclude, the children often categorize objects in a much more detailed way than adults do.

References

Lawrence W. Barsalou. 1983. Ad Hoc Categories. Memory & Cognition, 11(3):211-227.

Ruth A. Berman. 2009. Children’s Acquisition of Compound Constructions. In Rochelle Lieber and Pavol Štekauer (eds.), The Oxford Handbook of Compounding, 298-322. Oxford: Oxford University Press.

Marc H. Bornstein and Martha E. Arterberry. 2010. The Development of Object Categorization in Young Children: Hierarchical Inclusiveness, Age, Perceptual Attribute, and Group versus Individual Analyses. Developmental Psychology, 46(2):350-365.

Eve V. Clark. 2004. How Language Acquisition Builds on Cognitive Development. TRENDS in Cognitive Sciences, 8(10):472-478.

Eve V. Clark and Ruth A. Berman. 1987. Types of Linguistic Knowledge: Interpreting and Producing Compound Nouns. Journal of Child Language, 14(3):547-567.

Eve V. Clark, Susan A. Gelman and Nancy M. Lane. 1985. Compound Nouns and Category Structure in Young Children. Child Development 56(1):84-94.

Pamela Downing. 1977. On the Creation and Use of English Compound Nouns. Language 53(4):810-842.

Anna V. Fisher. 2011. Processing of Perceptual Information Is More Robust than Processing of Conceptual Information in Preschool-Age Children: Evidence from Costs of Switching. Cognition, 119(2):253-264.

Eleanor J. Gibson. 1969. Principles of Perceptual Learning and Development. New York: Appleton-Century-Crofts.

Stevan R. Harnad (ed). 1987. Categorical Perception: The Groundwork of Cognition. Cambridge: Cambridge University Press.

Susan S. Jones and Linda B. Smith. 1993. Place and Perception in Children’s Concepts. Cognitive Development, 8:113-139.

Frank C. Keil and Michael H. Kelly. 1987. Developmental changes in category structures. In S. Harnad (ed.), Categorical Perception: The Groundwork of Cognition. Cambridge: Cambridge University Press.

Jean M. Mandler. 2000. Perceptual and Conceptual Processes in Infancy. Journal of Cognition and Development, 1(1):3-36.

Ingmarie Mellenius. 1997. The Acquisition of Nominal Compounding in Swedish. [Travaux de l’Institut de Linguistique de Lund 31]. Lund: Lund University Press.

Gregory L. Murphy. 2002. The Big Book of Concepts. Cambridge, MA: MIT Press.

Simone P. Nguyen and Gregory L. Murphy. 2003. An Apple is More Than Just a Fruit: Cross-Classification in Children’s Concepts. Child Development, 24(6):1783-1806.

Eleanor Rosch. 1978. Principles of Categorization. In Eleanor Rosch and Barbara B. Lloyd (eds.), Cognition and Categorization, 27-48. Hillsdale, NJ: Erlbaum.

Robert Schreuder and Harald B. Baayen. 1997. How Complex Simplex Words Can Be. Journal of Memory and Language, 37:118-139.

Linda B. Smith. 1984. Young Children’s Understanding of Attributes and Dimensions: A Comparison of Conceptual and Linguistics Measures. Child Development, 55(2):363-380.

Sandra R. Waxman and Dana B. Markow. 1995. Words as Invitations to Form Categories: Evidence from 12- to 13-Month-Old Infants. Cognitive Psychology, 29:257-302.

range from mt1 to mt8, as shown in Appendix A. The items used in the present experiment belonged to two sets of derivatives, respectively characterized by full transparency (mt1) and relative opacity (mt4).

3 Experiment

3.1 Materials and methods

Adult native Italian speakers participated in a speeded lexical decision task with orthographic stimuli. 32 words and 32 nonwords functioned as targets. Each target (consisting of an underived word) was immediately preceded by a prime in three different conditions: morphological (e.g. ribellione/ribelle, ‘rebellion/rebel’), identity (ribelle/ribelle) and unrelated (xxxxxx/ribelle). Participants saw each target in only one of the three conditions. The test items are listed in Appendix B.

All primes were morphosemantically fully transparent. Half of them were classified as mt1 according to derIvaTario (full transparency), the other half as mt4 (with intervening morpho-phonological opacifying process). The two groups were carefully balanced for: (a) average lexical frequencies of both primes and targets, (b) length of prime and target (as measured by N of phonemes and N of graphemes), and (c) type of base. The last point needs clarification. As is well-known, Italian morphology is not word-based, i.e. the base does not correspond to an actual word. Since derIvaTario assumes 7 base types, it was necessary to control for the possible effect of this variable. Only the two most frequent base types were used in the present experiment: (i) root, i.e. an underived word without inflectional ending (e.g. bellezza ‘beauty’ as based on the root bell- of bello ‘beautiful.M.SG.’), (ii) verbal theme, i.e. a verb root plus the thematic vowel (e.g. battimento ‘beat’ as based on the verbal theme batti- of battere ‘to beat’). These two base types were equally distributed within the two word sets: 11 verbal themes, 5 roots.

Nonwords were created by replacing one phoneme in real Italian derivatives and the corresponding underived words. They had the same average length as the test words.

The order of words and nonwords was randomized across participants. Before performing the task, the participants were trained on a list of 8 items (4 words, 4 nonwords).

The priming effect of the derivatives was assessed as the average RT difference between the morphological condition and the identity and unrelated conditions. A statistically significant interaction between priming condition (morphological, identity, unrelated) and morphotactic transparency (mt1 vs. mt4) would suggest that the morphotactic contrast is cognitively salient.

3.2 Results

Repeated measure ANOVAs were run with priming condition as within-subject factor and morphotactic transparency as between-subject factor. The mean results are shown in Table 1. Comparing the morphological and the unrelated conditions, mt1 primes facilitated target recognition to a larger extent than mt4 primes. Similarly, comparing the morphological condition with the identity condition, mt4 primes slowed down target recognition to a larger extent than mt1 primes. Although the general tendency was consistent with the experimental hypothesis, the interaction condition x morphotactic transparency was not significant (Pillai’s trace F=0.547, p > .05). Thus, although the priming effect exerted by mt4 derivatives onto the corresponding underived words was weaker than the one yielded by mt1 derivatives, the current experiment does not support the initial hypothesis.

Table 1. Average reaction times and differential priming (ms) across conditions and transparency levels.

Level   Identity   Morphological   Unrelated   Diff. (morphological - identity)   Diff. (unrelated - morphological)
mt1     491        547             631         56                                 84
mt4     502        573             637         71                                 64

4 Discussion

The purpose of this experiment was to investigate whether morphotactic transparency is a cognitively relevant factor in the processing of Italian base forms when primed by corresponding derivatives. A significant differential priming effect was expected between mt1 and mt4 primes, which would have lent support to the Universal Scale of Morphotactic Transparency as implemented by derIvaTario. The experiment, however, did not produce the expected result, despite encouraging tendencies.

A possible explanation for this result is the strictly on-line character of the technique used
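The differential priming values reported in Table 1 follow from subtracting the condition means from one another. The sketch below recomputes them from the published means; the dictionary layout and the function name `differential_priming` are illustrative assumptions, and the code does not reproduce the authors' ANOVA.

```python
# Mean reaction times (ms) per priming condition, as reported in Table 1
MEAN_RT = {
    "mt1": {"identity": 491, "morphological": 547, "unrelated": 631},
    "mt4": {"identity": 502, "morphological": 573, "unrelated": 637},
}

def differential_priming(rt):
    """Return the two RT differences that Table 1 labels 'diff.' (in ms)."""
    return {
        # slowdown of morphological primes relative to identity primes
        "morphological_minus_identity": rt["morphological"] - rt["identity"],
        # facilitation of morphological primes relative to unrelated primes
        "unrelated_minus_morphological": rt["unrelated"] - rt["morphological"],
    }

diffs = {level: differential_priming(rt) for level, rt in MEAN_RT.items()}

for level, values in diffs.items():
    print(level, values)
```

Read this way, mt1 primes show the larger facilitation relative to the unrelated condition (84 ms vs. 64 ms) and the smaller cost relative to the identity condition (56 ms vs. 71 ms), which is the tendency the text describes as non-significant.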

Morphotactic effects on the processing of Italian derivatives

Pier Marco Bertinetto, Scuola Normale Superiore, P.zza dei Cavalieri 7, Pisa, Italy
Chiara Celata, Scuola Normale Superiore, P.zza dei Cavalieri 7, Pisa, Italy
Luigi Talamo, Università di Bergamo, Piazzetta Verzieri 1, Bergamo, Italy

Abstract

This paper investigates the processing of Italian affixed forms differing for morphotactic transparency. A lexical decision task with immediate priming was used. Following the principles of morphotactic transparency and Natural Morphology, the priming effect was hypothesized to be stronger for items with a higher degree of morphotactic transparency. However, the predictions were not totally met. The paper discusses possible explanations from the theoretical and methodological points of view, and highlights potential developments of the research.

1 Introduction

According to Dressler (1985, 2005), morphotactic transparency is one of the main parameters within the universal markedness theory of the so-called System-Independent Morphological Naturalness. It assumes the widespread existence of “opacifying obstructions” (Dressler, 2005: 272) in inflectional, derivational or compounding processes, and is expressed by preference degrees along a naturalness scale. The most natural forms are those without opacifying obstructions, followed by those based on mildly opacifying phonological processes (such as resyllabification), while allomorphic rules and suppletion are the most opaque and least natural morphological operations. In this approach, natural is synonymous with cognitively simple, iconic and therefore easy to acquire and process.

This work investigates the native speakers’ processing of Italian affixed forms differing for morphotactic transparency. Derived forms pertaining to two different classes of morphotactic transparency but matching for length, average frequency, stress pattern, as well as morphosemantic transparency were used as immediate primes in a lexical decision task; the corresponding underived words were used as targets. Following the principles of morphotactic transparency and Natural Morphology, the priming effect was hypothesized to be stronger for items with a higher degree of morphotactic transparency.

2 Morphotactic Transparency

To date, the only database for morphotactic transparency of derivational processes is derIvaTario, an open-source annotated lexicon of about 11,000 Italian derivatives (see Talamo & Celata, 2011; Talamo et al., submitted). The lexical source of derIvaTario is CoLFIS, Corpus e Lessico di Frequenza dell’Italiano Scritto (Bertinetto et al. 2005), a fully lemmatized three-million-word corpus of written Italian, sampled out of a carefully balanced variety of books, journals and newspapers. CoLFIS was created with the purpose of representing the mental lexicon of the ideal Italian speaker – or, more exactly, reader – as reliably as possible (Laudanna et al., 1995).

derIvaTario takes into account several morphological properties of the base and of each affix involved in the derivational cycles, crucially including morphotactic and morphosemantic transparency (see Libben, 1998 and Dressler, 2005 for the latter). With respect to the former, derIvaTario provides a value according to the Universal Scale of Morphotactic Transparency (Dressler, 1985 and 2005). The scale values

Using distributional data to explore derivational under-markedness: a study of the event/property polysemy in nominalization

Fabio Montermini, CLLE-ERSS, CNRS & Université de Toulouse 2 Jean Jaurès

Abstract

This paper proposes a corpus-based analysis of deverbal suffixed nouns in Italian displaying an ambiguity between a clear event reading (partenza ‘departure’) and a clear property reading (intelligenza ‘intelligence’). It focuses, in particular, on words derived with the suffixes -nza and -zione. Three sets of syntactic contexts for words containing the two suffixes (high- and low-frequency -nza words and high-frequency -zione words) were extracted from a large corpus of contemporary Italian and coded according to their semantic reading. The comparison of the three datasets, on the one side, confirms an evolution, already observed in the literature, of -nza from a typically deverbal action suffix to a typically deadjectival property suffix, and, on the other side, shows that the same ambiguity is observed with -zione nouns, although, unlike the case of -nza, in this case it remains a marginal feature. The results obtained show the interest of large-scale empirical observations for the analysis of morphological phenomena, and militate in favour of a model in which (regular) polysemy should be considered as a constituting property of derived words.

1 Introduction

Although having a strong empirical basis is an important feature of most current studies of morphological derivational phenomena, these are often realized on (sometimes very large) series of complex words taken in isolation, or on the basis of some examples which are intended to exemplify the totality of the uses a derived lexeme can enter into, or at least the most common, ‘unmarked’, ones. This approach is reductive, however, especially in the study of the semantics of derivational processes, given the pervasiveness and systematicity of such phenomena as polysemy, semantic underspecification, etc. The first goal of this talk is thus to present arguments in favor of a usage-based model of derivational morphology, i.e. an approach in which the properties of complex lexemes, and the rules by which they are formed, are investigated via a thorough observation of their real contexts of use. The perspective adopted here is an exemplar-based one, in the sense that morphological competence is considered to emerge on the basis of the linguistic material speakers are exposed to, and that this dynamics can be simulated by taking into account large amounts of real usage data. The analysis presented can also be qualified as distributional, since it is inspired, in its fundamental assumption, by distributionalist approaches which are current in semantics (cf. Lenci, 2008 for an overview), according to which there is a correlation between a unit’s meaning and its syntactic distribution.

The second goal of the talk is to provide evidence in favor of a non-compositional view of morphological derivation, according to which the semantic properties of complex lexemes cannot be simply computed on the basis of the sub-elements they contain, but rather on the basis of the lexical relations they enter into. The lack of full isomorphism between the form and the meaning of complex lexemes has been observed and investigated in many cases and in many languages. These include cases of over-marking, where an element (e.g. an affix) is present without carrying any evident meaning (cf. Roché, 2009, among others, for several examples in French), and parallel cases of under-marking, where a relevant semantic differentiation lacks an overt formal counterpart. The existence of the latter has long been observed, and is linked with several other phenomena which are well known in the literature on morphology and

Copyright © by Fabio Montermini. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

semantics, including (regular) polysemy (Booij, 2010: 78-79), multifunctionality (Luschützky and Rainer, 2013), morphological recycling (Hathout, 2011), etc. Morphological under-marking, precisely, constitutes the main focus of the analysis proposed here.

2 The data

In several languages, deverbal nouns present several instances of systematic polysemy, some of which are well described in the literature (e.g. action / result, cf. Rainer, 1996, Bisetto and Melloni, 2007). In particular, this paper is focused on cases of nominalization which, in spite of their frequency, have received less attention (but cf. Kerleroux, 2008 on French), namely deverbal nouns displaying an ambiguity between an event and a property reading, as in the following examples for the lexeme vigilanza in Italian:

(1)
la polizia ha effettuato una vigilanza continua
‘police guaranteed a continuous control’

vs.

la sua vigilanza è calata del 50%
‘his/her attention decreased by 50%’

Although Italian is the main focus of this paper, it should be observed that the same ambiguity can be observed in other Romance languages (and in English), involving several cognate affixes, such as those derived from Latin -antia, -atio, -mentum, -tura. In fact, this ambiguity should probably be ascribed to a specific property deverbal suffixes possessed in Latin (cf. (2)), since it is not observed with other morphological processes which cannot be directly linked to corresponding Latin constructions, such as verb-noun conversions or the (Germanic) deverbal suffix -al in English:

(2)
Lat.: adaequatio (‘adequacy’), observantia (‘observation’)

The polysemy in question can also be linked to the larger spectrum of meanings that have been observed for deverbal nouns; the typical event reading and the typical property reading, in fact, can be considered as the two poles of a continuum which includes the nominalization of more or less permanent states (cf. Fradin, 2011, 2014):

(3)
una partenza / *latitanza / *intelligenza istantanea
‘an instant departure / lam / intelligence’

una *partenza / latitanza / *intelligenza di due mesi
‘a two-month departure / lam / intelligence’

una *partenza / *latitanza / intelligenza ammirevole
‘an admirable departure / lam / intelligence’

Roughly, we can distinguish the three types above according to four dimensions, as exemplified in Table 1.

               action   punctual   bound   quantifiable
partenza       +        +          +       +
latitanza      +        –          +       –
intelligenza   –        –          –       –

Table 1: Types of deverbal nouns.

More specifically, the analysis presented has been carried out on nominalizations containing the two suffixes -nza and -zione¹, which share the property that, when they are constructed on a verb, they are linked, formally and semantically, to its participle (respectively, the present and the past participles) or to the homophonous adjective (accogliere / accogliente ⇒ accoglienza ‘acceptance’; educare / educato ⇒ educazione ‘education’). In addition, they can also be constructed on an adjective lacking a verbal counterpart (cf. frequente ⇒ frequenza ‘frequency’; perfetto ⇒ perfezione ‘perfection’), and in this case, base adjectives more often correspond to an individual-level predicate. In spite of their similarities, however, derived nouns in -nza and in -zione present several important differences. The most relevant one is probably the fact that while -nza is mainly attached to stative verbs (cf. Gaeta 2002), i.e. verbs which are semantically closer to (individual-level) adjectives (cf. Chierchia 1995: 177), no such tendency is observed with -zione, which, on the contrary, seems to display a preference for active event verbs. Consequently, apart from some exceptions (cf. partenza ‘departure’), the property reading can be virtually applied to all -nza nouns, at least in some of their uses, while for -zione the situation is reversed:

¹ In fact, both suffixes may present several different forms in surface, whose selection depends on the form of the base they attach to. The forms given are intended to be labels for more abstract formal representations (on the formal problems posed by -nza and -zione cf., respectively, Gaeta, 2002: 127-129; Gaeta 2004, 346-348; Thornton, 1990; Montermini, 2010).

References

Andrew Duchon, Manuel Perea, Nuria Sebastián-Gallés, Antonia Martí and Manuel Carreiras. 2013. EsPal: One-stop shopping for Spanish word properties. Behavior Research Methods, 45:1246–1258. doi:10.3758/s13428-013-0326-1

Arthur M. Jacobs, Jonathan Grainger and Ludovic Ferrand. 1995. The incremental priming technique: A method for determining within-condition priming effects. Perception & Psychophysics, 57:1101–1110. doi:10.3758/BF03208367

Francesca Peressotti, Roberto Cubelli and Remo Job. 2003. On recognizing proper names: The orthographic cue hypothesis. Cognitive Psychology, 47:87–116. doi:10.1016/S0010-0285(03)00004-5

Jeffrey S. Bowers, Gabriella Vigliocco and Richard Haan. 1998. Orthographic, phonological, and articulatory contributions to masked letter and word priming. Journal of Experimental Psychology: Human Perception and Performance, 24:1705–1719. doi:10.1037/0096-1523.24.6.1705

Jonathan Grainger, Arnaud Rey and Stéphane Dufau. 2008. Letter perception: from pixels to pandemonium. Trends in Cognitive Sciences, 12:381–387. doi:10.1016/j.tics.2008.06.006

Kate Mayall and Glyn W. Humphreys. 1996. Case mixing and the task-sensitive disruption of lexical processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22:278–294. doi:10.1037/0278-7393.22.2.278

Kenneth I. Forster and Chris Davis. 1984. Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, & Cognition, 10:680–698. doi:10.1037/0278-7393.10.4.680

Manuel Perea and Eva Rosa. 2002. Does “whole word shape” play a role in visual word recognition? Perception and Psychophysics, 64:785–794. doi:10.3758/BF03194745

Manuel Perea, Maria Jiménez and Pablo Gomez. 2014. A challenging dissociation in masked identity priming with the lexical decision task. Acta Psychologica, 148:130–135. doi:10.1016/j.actpsy.2014.01.014

Marta Vergara-Martínez, Manuel Perea, Pablo Gómez and Tamara Y. Swaab. 2013. ERP correlates of letter identity and letter position are modulated by lexical-frequency. Brain and Language, 125:11–27. doi:10.1016/j.bandl.2012.12.009

Mei-Ching Lien, Philip A. Allen and Caitlin Crawford. 2012. Electrophysiological evidence of different loci for case-mixing and word-frequency effects in visual word recognition. Psychonomic Bulletin & Review, 19:677–684. doi:10.3758/s13423-012-0251-9

Stanislas Dehaene, Laurent Cohen, Mariano Sigman and Fabien Vinckier. 2005. The neural code for written words: A proposal. Trends in Cognitive Sciences, 9:335–341. doi:10.1016/j.tics.2005.05.004

most of them do not allow this reading, while others accept it, a behavior for which no clear systematicity can be identified:

(4)
a. determinazione ‘determination’ / ‘determinedness’
   educazione ‘education’ / ‘educatedness’
b. istruzione ‘instruction’ / *‘educatedness’
   risoluzione ‘resolution’ / *‘determinedness’

In addition to the general features described above, some empirical observations motivate a deeper large-scale observation of the two derivational processes in question. First, for some of the -nza nouns displaying an event reading there exists a corresponding noun containing extra morphological material denoting a property (cf. assistenza ‘assistance’ ⇒ assistenzialità); similarly, to a past participle can correspond a derived noun denoting a property, either in concurrence with a -zione noun or not (cf. risoluto ⇒ risolutezza, determinato ⇒ determinatezza (vs. determinazione)). Second, the observation of real language use shows that lexemes with a typical event meaning can be used as property nouns, and vice-versa, as in the following examples taken from the Web:

(5)
La produzione basata sulla concorrenza del prezzo tende a tagliare i costi sostenuti dalla produzione basata sulla qualità.
‘Production based on low prices (lit. price concurrence) tends to cut the costs incurred by quality based production’.

Paolo […] era un uomo di estrazione nobile, di grande educazione e istruzione ed estremamente religioso e timorato di Dio.
‘St. Paul […] was a man of noble lineage, highly educated and very cultivated (lit. of great education and instruction), and extremely religious and God-fearing’.

Finally, as shown by Benincà and Penello (2005), and as confirmed by the data I have analyzed, while nouns with a pure event reading were the predominant output for -nza in ancient Italian, it is more employed today for the construction of property nouns. On the other hand, no comparable shift can be observed for -zione.

3 The analysis

In order to test the distribution of meanings for -nza and -zione nouns, in particular along the event / property divide, I extracted the 61 most frequent lexemes in -anza and -enza (the two possible formal variants)² in the CorIs³, a large corpus of written Italian. For each of the lexemes in question, 100 contexts of occurrence were randomly selected, each of which was semantically coded according to its compatibility with one of the two meanings in question. In particular, the coding was based on such properties as the possibility of being determined by quantification or a measure adjective, or the presence / absence of temporal boundaries.

Figure 1 shows the distribution of meanings according to the class of the base (verb vs. adjective), and, as expected, a strong correlation between verbal bases and event reading, on the one side, and adjectival bases and property reading, on the other, is observed. The diagram also shows that, for the most frequent -nza nouns, the two schemes are more or less equally available.

Figure 1: Distribution of meanings for the most frequent -nza nouns according to the class of the base.

In order to measure the functioning of this morphological process in the speakers’ synchronic competence, the same procedure was applied to low-frequency words containing -nza in the same corpus (62 lexemes overall having a frequency ≤ 3, 88 contexts overall). In this case, as shown in Figure 2, a strong preference of -nza for adjectival bases and for the property reading can be observed.

² The lexemes in question range from presenza (frequency 19,671) to indifferenza (frequency 1,671). Of course, the list was cleared from all lexemes ending in -nza that could not be clearly linked to a synchronically existing word.
³ http://corpora.dslo.unibo.it/coris_ita.html.

processing or whether it is also relevant in the retrieval of lexical representations. To attain this goal, we examined whether the effects of letter-case (lowercase vs. UPPERCASE) are modulated by word-frequency (a factor that indicates lexical/semantic activation; see Vergara-Martínez, Perea, Gómez, & Swaab, 2013), tracking the ERP waves in well-studied time windows (N/P150: 100-170 ms; P200: 170-250 ms; N400: 255-450 ms) in a lexical decision task.

2 Method

Twenty-two healthy, right-handed, native Spanish-speaking Valencia University students, naïve to the manipulation of the stimuli, participated in the study in exchange for a small gift.

We selected a set of 160 words from the Web-accessible EsPal database (Duchon, Perea, Sebastián-Gallés, Martí, & Carreiras, 2013). Half of the words were of high frequency and half were of low frequency. The two groups of words were matched in relevant psycholinguistic factors (length, orthographic neighborhood, concreteness, imageability…). Half of the words were presented in uppercase and half in lowercase (MOTHER; mother). In addition, a list of 160 pseudowords (half in lowercase, half in uppercase) was included for the purposes of the lexical decision task.

Participants were instructed to decide as accurately and rapidly as possible whether or not the stimulus was a Spanish word. They pressed one of two response buttons (YES/NO). The electroencephalogram (EEG) was recorded from 29 electrodes, averaged separately for each of the experimental conditions, each of the subjects and each of the electrode sites. For each time window, we conducted ANOVAs with word-frequency (high, low), case (lowercase, UPPERCASE), and AP (anterior, central-anterior, central, central-posterior and posterior) as factors in the design.

Results and Conclusions

The behavioral data revealed significantly faster responses for high-frequency than for low-frequency words (656 vs. 702 ms) and significantly faster responses for lowercase than for uppercase words (675 vs. 683 ms). There were no signs of an interaction between the two factors. The error data revealed the same pattern as the response time data.

In the N/P150, larger negative values were observed for lowercase than for uppercase words, with a central scalp distribution, whereas the effect of word-frequency was not significant. In the P200, and only for low-frequency words, larger positive values were observed for lowercase than for uppercase words in frontal/central scalp areas. With respect to the N400, the ERP waves revealed a dissociation of the letter-case effect for low- and high-frequency words. High-frequency words showed an effect of letter-case in an early stage of the N400, whereas low-frequency words showed an effect of letter-case (in the opposite direction; see Figure 1) in a later stage of the N400.

As expected, there was an early pre-lexical effect of letter-case that did not interact with word-frequency. Importantly, we found an interaction between letter-case and word-frequency not only in the N400 time window, which is commonly associated with lexical-semantic processing, but also in the P200 time window, thus supporting the hypothesis that letter-case may affect the mapping of visual-orthographic information onto word representations. Taken together, the present ERP data provide empirical support for the hypothesis that letter-case information may be stored in the abstract word representations (Peressotti et al., 2003), thus posing some problems for current computational and neural models of visual-word recognition.

Figure 1. Grand average ERP waves to frequency and case manipulations in one representative electrode. Different columns mark the four epochs under analysis.

ERP correlates of letter-case in visual word recognition

Barbara Leone-Fernandez
Departamento de Metodología, Facultad de Psicología, Av. Blasco Ibáñez, 21, Universitat de València, 46010, Spain

Manuel Perea
Departamento de Metodología, Facultad de Psicología, Av. Blasco Ibáñez, 21, Universitat de València, 46010, Spain

Marta Vergara-Martínez
Departamento de Psicología Evolutiva, Facultad de Psicología, Av. Blasco Ibáñez, 21, Universitat de València, 46010, Spain

1 Introduction

Visual word recognition is a key element of language comprehension. The vast majority of current models assume that the recognition of a printed word is based on the activation of abstract letter identity representations. The hierarchical neural accounts of letter/word recognition of Dehaene, Cohen, Sigman, and Vinckier (2005) and Grainger, Rey, and Dufau (2008) posit that, early in the process of lexical access, there are neuronal assemblies that respond to the word’s case-specific letters (e.g., they respond to ‘e’ but not to ‘E’). Later in processing, there are neuronal assemblies that respond to the abstract representation of the letter identity (e.g., they respond to the same degree to ‘e’ and to ‘E’).

Behavioral evidence using masked priming (i.e., a paradigm that taps onto early word processing; Forster & Davis, 1984; see Grainger, 2008, for review) has revealed that there is a rapid access to case-invariant letter representations. Specifically, the advantage of the identity condition over the unrelated condition is independent of the letter-case (similar advantage for kiss-KISS and EDGE-edge; see Bowers, Vigliocco, & Haan, 1998). Furthermore, response times to matched-case identical prime-target pairs (EDGE-EDGE) are virtually the same as the response times to mismatched-case identical prime-target pairs (edge-EDGE; see Jacobs, Grainger, & Ferrand, 1995; Perea, Jiménez, & Gómez, 2014).

To our knowledge, only one previous experiment investigated the temporal processing of letter-case using event-related potentials in an unmasked paradigm (Lien, Allen, & Crawford, 2012). Lien et al. compared the processing of lowercase-printed vs mIxEd-cAsE-printed words of different frequency (high and low). They found that the N170 amplitude, related to structural encoding, was sensitive to case mixing, but the P3, related to stimulus categorization, was sensitive to lexicality and word frequency. They proposed that case mixing affects early processing stages of visual word recognition.

The Lien et al. experiment is important, but it does not answer the question of whether letter-case plays a role during visual-word recognition with visually familiar words (note that mIxEd-cAsE stimuli are visually unfamiliar and difficult to process). In contrast, lowercase and uppercase words are the usual format when reading words. Indeed, experiments on visual-word recognition employ either lowercase or uppercase words with no explicit justification.

Importantly, there is one account that does assume that letter-case information may form an integral part of a word’s lexical representation. Specifically, Peressotti, Cubelli, and Job (2003) claimed that ‘while size, font and style (cursive or print) affect the visual shape of letters, the uppercase–lowercase distinction is abstract in nature as it is an intrinsic property of letters’ (p. 108). In the framework of Peressotti et al.’s ‘orthographic cue’ account, a given lexical unit would not be retrieved only on the basis of the letter identity and letter position, but also on the basis of letter-case information. Given that most printed words are presented in lowercase, this should provide an advantage for the processing of lowercase vs. uppercase words (see Mayall & Humphreys, 1996; Perea & Rosa, 2002, for behavioral evidence of a lowercase advantage in visual-word recognition).

The main aim of this study is to examine the time course of letter-case on lexical access. The ERPs may help to disentangle whether letter case is an attribute that is only relevant in early perceptual

Figure 2: Distribution of meanings for the low-frequency -nza nouns according to the class of the base.

This result confirms the observation, mentioned above, that in the history of Italian -nza evolved from a (mainly) deverbal suffix forming event nouns to a (mainly) deadjectival suffix forming property nouns.

Moreover, in order to determine whether the features identified for -nza are specific to this construction or belong to deverbal nominal suffixes in general, the same procedure was further applied to a comparable set of words in -zione (including some possible formal variants, like -sione), namely the 61 most frequent forms in the CorIs⁴.

Figure 3: Distribution of meanings for -zione nouns.

As Figure 3 shows, the event / property polysemy remains marginal for -zione, thus suggesting that, while this polysemy can be considered a constitutive property of the -nza word formation pattern, it constitutes a rare and marked subpattern for -zione.

⁴ Ranging from amministrazione (frequency 17,139) to previsione (frequency 3,987).

References

Paola Benincà, and Nicoletta Penello. 2005. Il suffisso -anza/-enza tra sincronia e diacronia. In M. Grossmann and A.M. Thornton (eds.), La formazione delle parole. Atti del XXXVII Congresso internazionale di studi della Società di linguistica italiana. L’Aquila, 25-27 settembre 2003, 69-86. Roma: Bulzoni.

Antonietta Bisetto, and Chiara Melloni. 2007. Result nominals: A lexical-semantic investigation. In G. Booij, B. Fradin, E. Guevara, S. Scalise and A. Ralli (eds.), On-line Proceedings of the Fifth Mediterranean Morphology Meeting (MMM5). Fréjus, 15-18 September 2005. Bologna: Università di Bologna.

Geert Booij. 2010. Construction Morphology. Oxford University Press, Oxford.

Gennaro Chierchia. 1995. Individual-level predicates as inherent generics. In: G.N. Carlson and F.J. Pelletier (eds.) The Generic Book, 176-223. Chicago / London: University of Chicago Press.

Bernard Fradin. 2011. Remarks on state denoting nominalizations. Recherches linguistiques de Vincennes, 40: 73-99.

Bernard Fradin. 2014. La variante et le double. In: F. Villoing, S. David and S. Leroy (eds.) Foisonnements morphologiques. Études en hommage à Françoise Kerleroux, 109-147. Nanterre: Presses Universitaires de Paris Ouest.

Livio Gaeta. 2002. Quando i verbi compaiono come nomi. Un saggio di morfologia naturale. FrancoAngeli, Milano.

Nabil Hathout. 2011. Une approche typologique de la construction des mots. In: M. Roché, G. Boyé, N. Hathout, S. Lignon, M. Plénat, Des unités morphologiques au lexique, 251-318. Paris: Hermès – Lavoisier.

Françoise Kerleroux. 2008. Des noms indistincts. In: B. Fradin (ed.), La raison morphologique. Hommage à la mémoire de Danielle Corbin, 113-132. Amsterdam / Philadelphia: John Benjamins.

Alessandro Lenci. 2008. Distributional semantics in linguistic and cognitive research. A foreword. Rivista di Linguistica, 20.1: 1-30.

Hans C. Luschützky, and Franz Rainer. 2013. Instrument and place nouns: A typological and diachronic perspective. Linguistics, 51.6: 1301-1359.

Fabio Montermini. 2010. The lexical representation of nouns and adjectives in Romance languages. Recherches linguistiques de Vincennes, 39: 135-161.

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

Franz Rainer. 1996. La polysémie des noms abstraits: historique et état de la question. In: N. Flaux, M. Glatigny, and M. Samain (eds.), Les noms abstraits. Histoire et théorie. Actes du colloque de Dunkerque (15-18 septembre 1992), 116-126. Villeneuve d’Ascq: Presses Universitaires du Septentrion.

Michel Roché. 2009. Pour une morphologie lexicale. Mémoires de la Société de Linguistique de Paris. Nouvelle série, 17: 65-87.

Anna M. Thornton 1990/1991. Sui deverbali italiani in -mento e -zione. Archivio glottologico italiano, LXXV (2): 169-207 / LXXVI (1): 79-102.

4 Conclusions:

Most of the available literature and previous studies, using a range of different methodologies, consistently demonstrate that figurative language is demanding for ASD populations.

In particular, metaphors present a difficulty in terms of processing for the ASD group. The preliminary results of this study confirm our earlier findings that the auditory modality is more demanding for the ASD group. Surprisingly, the significant effect we found for accuracy was confined to the conventional metaphors. An explanation can be sought in the difference between conventional and novel metaphors. Conventional metaphors are less transparent, making them more problematic compared to novel metaphors, as these might be processed without the need for prior familiarity.

These results support the findings in Chahboun et al (2015), where a similar effect was found for idioms contra novel metaphors. Idioms are similar to conventional metaphors in that both types of expression are less transparent than both literal expressions and novel metaphors.

Acknowledgments:

This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no 316748.

References:

Chahboun, S., Vulchanov, V., Saldaña, D., Eshuis, H. & Vulchanova, M. (2015). Contextual and visual cues in the interpretation of idioms in high functioning autism. Manuscript in preparation.

Fauconnier, G. (1985). Mental spaces. Cambridge, MA: MIT Press.

Gold, R. & Faust, M. (2010). Right Hemisphere Dysfunction and Metaphor Comprehension in Young Adults with . Journal of Autism and Developmental Disorders, 40, 7, 800-811.

Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.

Micai, M., Vulchanova, M., & Saldaña, D. (2015). Can targeted instructions change the reading strategy of children with ASD? An eye link study. Manuscript in preparation.

Vulchanova, M., Talcott, J., Vulchanov, V., and Stankova, M. (2012). Language against the odds, or rather not: The Weak Central Coherence hypothesis and language. Journal of Neurolinguistics, 25, 1, 13-30.

Vulchanova, M., Saldaña, D., Chahboun, S., & Vulchanov, V. (2015). Figurative language processing in atypical populations: The ASD perspective. Frontiers in Human Neuroscience, special issue The Metaphorical Brain. doi:10.3389/fnhum.2015.00024.

A Distributional Semantics Approach to Implicit Language Learning

Dimitrios Alikaniotis, John N. Williams
Department of Theoretical and Applied Linguistics, University of Cambridge
9 West Road, Cambridge CB3 9DP, United Kingdom
{da352|jnw12}@cam.ac.uk

1 Introduction

Vector-space models of semantics (VSMs) derive word representations by keeping track of the co-occurrence patterns of each word when found in large linguistic corpora. By exploiting the fact that similar words tend to appear in similar contexts (Harris, 1954), such models have been very successful in tasks of semantic relatedness (Landauer and Dumais, 1997; Rohde et al., 2006). A common criticism addressed towards such models is that those co-occurrence patterns do not explicitly encode specific semantic features, unlike more traditional models of semantic memory (Collins and Quillian, 1969; Rogers and McClelland, 2004). Recently, however, corpus studies (Bresnan and Hay, 2008; Hill et al., 2013b) have shown that some ‘core’ conceptual distinctions such as animacy and concreteness are reflected in the distributional patterns of words and can be captured by such models (Hill et al., 2013a).

In the present paper we argue that distributional characteristics of words are particularly important when considering concept availability under implicit language learning conditions. Studies on implicit learning of form-meaning connections have highlighted that during the learning process a restricted set of conceptual distinctions are available, such as those involving animacy and concreteness. For example, in studies by Williams (2005) (W) and Leung and Williams (2014) (L&W) the participants were introduced to four novel determiner-like words: gi, ro, ul, and ne. They were explicitly told that they functioned like the article ‘the’ but that gi and ro were used with near objects and ul and ne with far objects. What they were not told was that gi and ul were used with living things and ro and ne with non-living things. Participants were exposed to grammatical determiner-noun combinations in a training task and afterwards given novel determiner-noun combinations to test for generalisation of the hidden regularity. W and L&W report such a generalisation effect even in participants who remained unaware of the relevance of animacy to article usage – semantic implicit learning. Paciorek and Williams (2015) (P&W) report similar effects for a system in which novel verbs (rather than determiners) collocate with either abstract or concrete nouns.

However, certain semantic constraints on semantic implicit learning have been obtained. In P&W generalisation was weaker when tested with items that were of relatively low semantic similarity to the exemplars received in training. In L&W Chinese participants showed implicit generalisation of a system in which determiner usage was governed by whether the noun referred to a long or flat object (corresponding to the Chinese classifier system) whereas there was no such implicit generalisation in native English speakers. Based on this evidence we argue that the implicit learnability of semantic regularities depends on the degree to which the relevant concept is reflected in language use. By forming semantic representations of words based on their distributional characteristics we may be able to predict what would be learnable under implicit learning conditions.

2 Simulation

We obtained semantic representations using the skip-gram architecture (Mikolov et al., 2013) provided by the word2vec package,¹ trained with hierarchical softmax on the British National Corpus or on a Chinese Wikipedia dump file of comparable size. The parameters used were as follows: window size: B5A5, vector dimensionality: 300, subsampling threshold: $t = e^{-3}$ only for the English corpus.

The skip-gram model encapsulates the idea of distributional semantics introduced above by

¹ https://code.google.com/p/word2vec/

3 Preliminary results:

The data of both the control and experimental group (N=19) were analysed with R. A linear mixed model analysis on RTs revealed a significant interaction between presentation modality and conventionality of the metaphors (p<.05), with poorer performance of the ASD group when the prime was presented auditorily. Furthermore, there was an interaction between group and age, with younger groups taking more time to respond. Finally, the results showed a significant interaction between modality, type of target and age. The younger groups’ performance was slower when the prime was presented auditorily, and when the target relationship with the prime was figurative.

Regarding accuracy, with a generalized linear mixed model (R) we found significant interactions depending on the modality of the prime. The ASD groups were less accurate in the auditory modality, in contrast with the control groups. Moreover, the results show a significant interaction between conventional metaphors and age in both groups. There was a significant interaction between the type of target, modality and age. Finally, a main effect of group, a main effect of age and an interaction of age and group were observed. The typically developing participants were more accurate in both age ranges. In both the experimental and the control group, the older participants performed better than the younger ones, and the difference in performance between the age ranges in the ASD group was greater than in the control group.

Figure 1: Examples of the semantically related targets used: literal or metaphorical relation.

2.3 Procedure:

Each participant was tested individually in a single session. Participants either saw the prime expression on a computer screen or heard it via loud-speakers. The timing of the specific stimulus events on each trial was as follows: (1) the prime is presented as visual text on the screen or auditorily via the loud-speakers (depending on the experimental block); (2) a fixation point is presented followed by a delay of 400 ms as a latency; (3) a target is presented as word or non-word; (4) finally, participants have to decide whether the target is a word or not in Spanish (cf. Figure 2).

Figure 2: Sequence of events for the trials of the experiment.


say, table. Here we exploit priming to reveal how metaphorical expressions are associated with figurative as opposed to literal interpretations in individuals with ASD. We are also interested in whether or not the modality of presentation of the stimuli (auditory vs. written) has an effect on their processing, as already established in on-going research (Chahboun, Vulchanov, Saldaña & Vulchanova, 2015).

2 Method:

2.1 Participants:

Two age groups of high-functioning ASD participants (N=48) and controls (N=39) were included (all native speakers of Spanish). Each group has two age ranges:

• Group 1: Age range 10-12. Control group (N=18) and ASD group (N=26).
• Group 2: Age range 16-20. Control group (N=21) and ASD group (N=22).

Participants and their legal tutors (usually the parents) provided written consent for entry into the study. Most of the individuals had participated in an earlier study (Chahboun et al 2015).

The diagnosis of ASD was confirmed according to the Autism Diagnostic Observation Schedule (ADOS) and also with the Autism Quotient (AQ).

We also made sure the participants do not have any structural language deficit. In addition to measuring the general IQ with the Wechsler Scale (WISC IV or WAIS) we measured the participants’ receptive vocabulary (British Picture Vocabulary Scale), their grammatical language level (CEG: Test of comprehension of grammatical structures) and theory of mind.

2.2 Apparatus and Stimuli:

Stimuli were displayed on a color monitor controlled by E-prime software implemented on a Dell compatible laptop. Responses were collected with a response box; response accuracy (ACC) and reaction times (RTs) were measured by the E-prime software. The stimuli included 36 prime expressions classified into 3 different types: novel metaphors, conventional metaphors and free combinations (non-metaphorical expressions), all comprising a noun and a modifier. The target words were semantically related to the prime expressions. On half of the instances for each group of expressions, targets were related to the figurative interpretation of the prime; the remaining half were related to the literal meaning (cf. Figure 1).

In a pilot study with 150 adult native speakers of Spanish, we determined the degree of familiarity of the metaphors. This allowed us to verify the conventionality of the metaphors or their novelty, and their inclusion in the test stimuli. The same number of filler expressions (N=36) were added, respectively as primes, and non-words served as targets. Thus, each participant responded in total to 72 trials, 36 in each modality: visual modality (stimuli presented orthographically) and auditory (stimuli presented auditorily). The experiment was designed as a lexical decision task on the target word.

[Figure 1 plot: panel “Williams (2005)”; x-axis: Epoch; y-axis: Activation; curves: Grammatical, Ungrammatical.]

Figure 1: Generalisation gradients obtained from the Williams (2005) dataset. The gradients were obtained by averaging the output activations for the grammatical and the ungrammatical pairs, respectively. The network hyperparameters used were: learning rate: $\eta = 0.01$, weight decay: $\gamma = 0.01$, size of hidden layer: $h \in \mathbb{R}^{100}$. For this and all the reported simulations the dashed vertical lines mark the epoch in which the training error approached zero. See text for more information on the experiment.

[Figure 2 plots: panel “Paciorek & Williams (2015), exp. 1”; x-axis: Epoch; y-axis: Activation; curves: Abstract Grammatical, Abstract Ungrammatical, Concrete Grammatical, Concrete Ungrammatical. Second panel: endorsement rates for Abstract and Concrete items; bars: Grammatical, Ungrammatical.]

Figure 2: Results of our simulation along with the behavioural results of Paciorek and Williams (2015), exp. 1. The hyperparameters used were the same as in the simulation of Williams (2005).

learning which contexts are more probable for a given word. Concretely, it uses a neural network architecture, where each word from a large corpus is presented in the input layer and its context (i.e. several words around it) in the output layer. The goal of the network is to learn a configuration of weights such that when a word is presented in the input layer the nodes in the output that become more activated correspond to those words in the vocabulary which had appeared more frequently as its context.

As argued above, the resulting representations will carry, by means of their distributional patterns, semantic information such as concreteness or animacy. Consistent with the above hypotheses, we predict that given a set of words in the training phase, the degree to which one can generalise to novel nouns will depend on how much the relevant concepts are reflected in the former words. If, for example, the words used during the training session do not encode animacy based on their co-occurrence statistics, albeit denoting animate nouns, then generalising to other animate nouns would be more difficult.

In order to examine this prediction, we fed the resulting semantic representations to a non-linear classifier (a feedforward neural network) the task of which was to learn to associate noun representations to determiners or verbs, depending on the study in question. During the training phase, the neural network received as input the semantic vectors of the nouns and the corresponding determiners/verbs (coded as 1-in-N binary vectors, where N is the number of novel non-words)² in the output vector. Using backpropagation with stochastic gradient descent as the learning algorithm, the goal of the network was to learn to discriminate between grammatical and ungrammatical noun – determiner/verb combinations. We hypothesise that this could be possible if either specific features of the input representation or a combination of them contained the relevant concepts. Considering the distributed nature of our semantic representations, we explore the latter option by adding a tanh hidden layer, the purpose of which was to extract non-linear combinations of features of the

² All the studies reported use four novel non-words.

input vector. We then recorded the generalisation ability through time (epochs) of our classifier by simply asking what would be the probability of encountering a known determiner $k$ with a novel word $\vec{w}$ by taking the softmax function:

$$p(y = k \mid \vec{w}) = \frac{\exp(\mathrm{net}_k)}{\sum_{k' \in K} \exp(\mathrm{net}_{k'})} \qquad (1)$$

3 Results and Discussion

Figures 1-4 show the results of the simulations across four different datasets which reflect different semantic manipulations. The simulations show the generalisation gradients obtained by applying eq. (1) to every word in the generalisation set and then keeping track of the activation of the different determiners (W, L&W) or verbs (P&W) through time. For example, in W where the semantic distinction was between animate and inanimate concepts ‘gi lion’ would be considered a grammatical sequence while ‘ro lion’ an ungrammatical one. If the model has been successful in learning that ‘gi’ should be activated more given animate concepts then the probability $P(y = \text{gi} \mid \vec{w}_{\text{lion}})$ would be higher than $P(y = \text{ro} \mid \vec{w}_{\text{lion}})$. Fig. 1 shows the performance of the classifier on the testing set of W where, in the behavioural data, selection of the grammatical item was significantly above chance in a two alternative forced choice task for the unaware group. The slopes of the gradients clearly show that on such a task the model would favour grammatical combinations as well.

Figures 2-3 plot the results of two experiments from P&W which focused on the abstract/concrete distinction. P&W used a false memory task in the generalisation phase, measuring learning by comparing the endorsement rates between novel grammatical and novel ungrammatical verb-noun pairs. It was reasoned that if the participants had some

[Figures 3 and 4: plots not reproduced. Upper panels: activation by epoch for Paciorek & Williams (2015), exp. 4 (abstract/concrete, grammatical/ungrammatical) and Leung & Williams (2014), exp. 3 (Chinese/English, grammatical). Lower panels: endorsement rates for abstract and concrete items, and RT (ms) for English and Chinese participants, grammatical vs. ungrammatical.]

Figure 3: Results of our simulation along with the behavioural results of Paciorek and Williams (2015), exp. 4. The hyperparameters used were the same as in the simulation of Williams (2005).

Figure 4: Results from Leung and Williams (2014), exp. 3. See text for more info on the measures used. The gradients for the ungrammatical combinations are (1 − grammatical). The value of the weight decay was set to γ = 0.05 while the rest of the hyperparameters used were the same as in the simulation of Williams (2005).

Metaphorical priming in a lexical decision task in high functioning autism

Chahboun, S. (1), Vulchanov, V. (1), Saldaña, D. (2), Eshuis, R. (1) and Vulchanova, M. (1)

(1) Language Acquisition and Language Processing Lab, NTNU, Norway.
(2) Departamento de Psicología Evolutiva, University of Seville, Spain.

1 Introduction:

The difficulties experienced by autistic individuals with regard to communication and language are widely known and well documented. Individuals with high-functioning autism (ASD) are distinguished by relative preservation of linguistic and cognitive skills. However, problems with pragmatic language skills have been consistently reported across the autistic spectrum, even when structural language is intact. Many studies establish failure to understand metaphors, idioms and other forms of figurative language (Gold & Faust, 2010; Vulchanova, Talcott, Vulchanov & Stankova, 2012). Figurative language takes many forms, conceptual metaphors being one of the most common. On the cognitive level, conceptual metaphors are the mental representations we establish in order to map between two domains (Lakoff & Johnson 1980; Fauconnier 1985; Vulchanova, Saldaña, Chahboun & Vulchanov 2015). In other words, the logic of one conceptual domain is applied to another.

Several studies have shown impaired figurative language in ASD populations. One of the first studies in figurative language in autism for instance was that of Happé (1995). She used 3 types of expressions: synonyms, similes, and metaphors. The underlying assumption of this study is that, in order to understand these kinds of expressions, we need to be able to “decode” the intentions and ideas of the person to whom we are talking. The findings from this study showed that metaphor comprehension is impaired in individuals with autism.

Our hypothesis in this study is that this deficient metaphorical ability might depend not only on the type of figurative expression (regarding the novelty or conventionality of it), but also on the way these expressions are perceived. This is especially relevant for individuals with ASD who need specific ways of integrating inputs, such as the ways in which the type of instruction can drastically change the reading comprehension in this population (Micai, Vulchanova & Saldaña 2015). In the current study, we test responses to metaphorical expressions and whether or not metaphors solicit priming for literal or rather the appropriate figurative interpretation in high-functioning children and adolescents with ASD.

These tests are carried out through a cross modal priming task. Priming is a process occurring outside conscious awareness, and thus differs from direct retrieval. It is an effect of retrieval from implicit memory, creating a heightened sensitivity to certain stimuli. In general, priming effects are found between lexical items which share a semantic component or a semantic association. For example, angel is recognized quicker, if it is followed by wings than,

Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

Reference

Siyanova-Chanturia, A., Conklin, K., & Schmitt, N. (2011). Adding more fuel to the fire: An eye-tracking study of idiom processing by native and non-native speakers. Second Language Research, 27(2), 251–272.
Cacciari, C., & Tabossi, P. (1988). The comprehension of idioms. Journal of Memory and Language, 27(6), 668–683.
Zempleni, M.-Z., Haverkort, M., Renken, R., & Stowe, L. A. (2007). Evidence for bilateral involvement in idiom comprehension: An fMRI study. NeuroImage, 34(3), 1280–1291.
Lauro, L. J. R., Tettamanti, M., Cappa, S. F., & Papagno, C. (2008). Idiom Comprehension: A Prefrontal Task? Cerebral Cortex, 18(1), 162–170.
Boulenger, V., Hauk, O., & Pulvermüller, F. (2009). Grasping Ideas with the Motor System: Semantic Somatotopy in Idiom Comprehension. Cerebral Cortex, 19(8), 1905–1914.
Rommers, J., Dijkstra, T., & Bastiaansen, M. (2012). Context-dependent Semantic Processing in the Human Brain: Evidence from Idiom Comprehension. Journal of Cognitive Neuroscience, 25(5), 762–776.
Libben, M. R., & Titone, D. A. (2008). The multidetermined nature of idiom processing. Memory & Cognition, 36(6), 1103–1121.
Luck, S. J. (2014). An Introduction to the Event-Related Potential Technique. MIT Press.
Hoeks, J. C. J. and Brouwer, H. (2014). Electrophysiological Research on Conversation and Discourse Processing. In: Holtgraves, T. (Ed.), Oxford Handbook of Language and Social Psychology, pp. 365-386. New York: Oxford University Press.
Hagoort, P., & Berkum, J. van. (2007). Beyond the sentence given. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 801–811.
Federmeier, K. D. (2007). Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology, 44(4), 491–505.
Coulson, S., & Petten, C. V. (2002). Conceptual integration and metaphor: An event-related potential study. Memory & Cognition, 30(6), 958–968.
Lai, V. T., Curran, T., & Menn, L. (2009). Comprehending conventional and novel metaphors: An ERP study. Brain Research, 1284, 145–155.
Regel, S., Gunter, T. C., & Friederici, A. D. (2010). Isn’t It Ironic? An Electrophysiological Exploration of Figurative Language Processing. Journal of Cognitive Neuroscience, 23(2), 277–293.
Berkum, J. J. A. V., Holleman, B., Nieuwland, M., Otten, M., & Murre, J. (2009). Right or Wrong? The Brain’s Fast Response to Morally Objectionable Statements. Psychological Science, 20(9), 1092–1099.
Molinaro, N., Carreiras, M., & Duñabeitia, J. A. (2012). Semantic combinatorial processing of non-anomalous expressions. NeuroImage, 59(4), 3488–3501.
Donchin, E., & Coles, M. G. H. (1988). Is the P300 component a manifestation of context updating? Behavioral and Brain Sciences, 11(03), 357–374.
Verleger, R. (1988). Event-related potentials and cognition: A critique of the context-updating hypothesis and an alternative interpretation of the P300. Behavioral and Brain Sciences, 11, 343–427.
Vespignani, F., Canal, P., Molinaro, N., Fonda, S., & Cacciari, C. (2009). Predictive Mechanisms in Idiom Comprehension. Journal of Cognitive Neuroscience, 22(8), 1682–1700.
Boulenger, V., Shtyrov, Y., & Pulvermüller, F. (2012). When do you grasp the idea? MEG evidence for instantaneous idiom understanding. NeuroImage, 59(4), 3502–3513.

knowledge of the system they would endorse more novel grammatical sequences. Expt 1 (Fig. 2) used generalisation items that were higher in semantic similarity to trained items than was the case in Expt 4 (Fig. 3). The behavioural results from the unaware groups (bottom rows) show that this manipulation resulted in larger grammaticality effects on familiarity judgements in Expt 1 than Expt 4, and also higher endorsements for concrete items in general in Expt 1. Our simulation was able to capture both of these effects.

L&W Expt 3 examined the learnability of a system based on a long/flat distinction, which is reflected in the distributional patterns of Chinese but not of English. In Chinese, nouns denoting long objects have to be preceded by a specific classifier while flat object nouns by another. L&W’s training phase consisted of showing to participants combinations of thin/flat objects with novel determiners, asking them to judge whether the noun was thin or flat. After a period of exposure, participants were introduced to novel determiner – noun combinations, which either followed the grammatical system (control trials) or did not (violation trials). Participants had significantly lower reaction times (Fig. 4, bottom row) when presented with a novel grammatical sequence than an ungrammatical sequence, an effect not observed in the RTs of the English participants. The corresponding results of our simulations plotted in Fig. 4 show that indeed the regularity was learnable when the semantic model had only experienced a Chinese text, but not when it experienced the English corpus.

While more direct evidence is needed to support our initial hypothesis, our results seem to point to the direction that semantic information encoded by the distributional characteristics of words when found in large corpora can be important in determining what could be implicitly learnable.

References

Bresnan, J. and Hay, J. (2008). Gradient grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua, 118(2):245–259.
Collins, A. M. and Quillian, M. R. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8(2):240–247.
Harris, Z. (1954). Distributional structure. Word, 10(2-3):146–162.
Hill, F., Kiela, D., and Korhonen, A. (2013a). Concreteness and Corpora: A Theoretical and Practical Analysis. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 75–83.
Hill, F., Korhonen, A., and Bentz, C. (2013b). A Quantitative Empirical Analysis of the Abstract/Concrete Distinction. Cognitive Science, 38(1):162–177.
Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2):211–240.
Leung, J. H. C. and Williams, J. N. (2014). Crosslinguistic Differences in Implicit Language Learning. Studies in Second Language Acquisition, 36(4):733–755.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119.
Paciorek, A. and Williams, J. (2015). Semantic generalisation in implicit language learning. Journal of Experimental Psychology: Learning, Memory and Cognition.
Rogers, T. T. and McClelland, J. L. (2004). Semantic Cognition: A Parallel Distributed Processing Approach. MIT Press.
Rohde, D., Gonnerman, L. M., and Plaut, D. C. (2006). An improved model of semantic similarity based on lexical co-occurrence. Communications of the ACM.
Williams, J. N. (2005). Learning without awareness. Studies in Second Language Acquisition, 27:269–304.
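The classifier used in the simulations (a feedforward network with one tanh hidden layer, trained by stochastic gradient descent with weight decay, and read out through the softmax of eq. (1)) can be illustrated with a minimal sketch. This is not the authors' implementation. It is a pure-Python toy that assumes a bias-free network, weight decay applied as an L2 term in every update, uniform weight initialisation, and hand-made three-dimensional "semantic" vectors standing in for corpus-derived ones. The names `TanhClassifier`, `generalisation_gradient`, `TRAIN` and `LION` are hypothetical; only the learning rate (0.01), weight decay (0.01), hidden-layer size (100) and the determiners ‘gi’ and ‘ro’ are taken from the text.

```python
import math
import random


def softmax(net):
    """Eq. (1): p(y = k | w) = exp(net_k) / sum over k' of exp(net_k')."""
    m = max(net)  # subtracting the maximum avoids overflow and leaves p unchanged
    exps = [math.exp(v - m) for v in net]
    total = sum(exps)
    return [e / total for e in exps]


class TanhClassifier:
    """Bias-free network (an assumption): input -> tanh hidden layer -> softmax."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.01, decay=0.01, seed=0):
        rng = random.Random(seed)
        self.lr, self.decay = lr, decay
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_out)]

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.W1]
        net = [sum(w * hi for w, hi in zip(row, h)) for row in self.W2]
        return h, softmax(net)

    def sgd_step(self, x, k):
        """One SGD step on the cross-entropy loss for target label k."""
        h, p = self.forward(x)
        d_out = [pj - (1.0 if j == k else 0.0) for j, pj in enumerate(p)]
        # Hidden-layer error, computed with the pre-update output weights.
        d_hid = [(1.0 - hi * hi) * sum(d * row[i] for d, row in zip(d_out, self.W2))
                 for i, hi in enumerate(h)]
        for d, row in zip(d_out, self.W2):
            for i, hi in enumerate(h):
                row[i] -= self.lr * (d * hi + self.decay * row[i])
        for d, row in zip(d_hid, self.W1):
            for n, xn in enumerate(x):
                row[n] -= self.lr * (d * xn + self.decay * row[n])


def generalisation_gradient(model, train, novel, label, epochs, seed=0):
    """Train for `epochs` passes; record p(y = label | novel) after each pass."""
    rng = random.Random(seed)
    data = list(train)
    gradient = []
    for _ in range(epochs):
        rng.shuffle(data)  # present the training pairs in random order
        for x, k in data:
            model.sgd_step(x, k)
        gradient.append(model.forward(novel)[1][label])
    return gradient


# Toy data (hypothetical): dimension 0 ~ animate contexts, dimension 1 ~ inanimate ones.
GI, RO = 0, 1  # 'gi' pairs with animate nouns, 'ro' with inanimate nouns
TRAIN = [([1.0, 0.0, 0.2], GI), ([0.9, 0.1, 0.1], GI),
         ([0.0, 1.0, 0.2], RO), ([0.1, 0.9, 0.3], RO)]
LION = [0.8, 0.2, 0.0]  # novel animate noun, never seen during training

if __name__ == "__main__":
    model = TanhClassifier(n_in=3, n_hidden=100, n_out=2)
    curve = generalisation_gradient(model, TRAIN, LION, GI, epochs=200)
    print("p(gi | lion) after the first and last epoch:", curve[0], curve[-1])
```

The list returned by `generalisation_gradient` plays the role of one curve in the upper panels of the figures: the probability assigned to the grammatical determiner for an unseen noun, tracked epoch by epoch. In this two-label toy the ungrammatical curve is simply one minus the grammatical one.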

Electrophysiological correlates of idiom comprehension: semantic composition does not follow lexical retrieval

Paolo Canal (a, b), Francesca Pesciarelli (a), Francesco Vespignani (c), Nicola Molinaro (d, e) & Cristina Cacciari (a)

(a) Department of Biomedical Sciences, Università degli Studi di Modena e Reggio Emilia, Italy
(b) NEtS Center for Neurocognition Epistemology and Theoretical Syntax, IUSS, Pavia, Italy
(c) Department of Cognition and Formation Sciences, Università degli Studi di Trento, Italy
(d) BCBL, Basque center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
(e) Ikerbasque, Basque Foundation for Science, Bilbao, 48001, Spain

1 Introduction

Idiomatic expressions, such as break the ice, are pervasive in everyday communication. They are frequently co-occurring sequences of words with a conventional meaning that is not derived from word-by-word semantic composition, but rather can be retrieved as such from semantic memory. Idioms are often read faster compared to literal sentences [e.g., Siyanova-Chanturia et al., 2011] and also lexical decision times are faster on idiom related words than on literal related targets [e.g., Cacciari & Tabossi, 1988]. Recent EEG data further suggest that semantic composition processes of idiomatic constituents might not be fully engaged during comprehension [Rommers et al, 2013]. Finally brain-imaging studies reported stronger and more widespread activation of the language network when reading idioms compared to non-idiomatic sentences [Zempleni et al., 2007; Lauro et al., 2008; Boulenger et al., 2009], suggesting that idiom comprehension might involve more cognitive resources. From these fragmented results, it is not clear yet how

gling between memory retrieval and semantic integration processes [e.g., Hoeks & Brouwer, 2014].

2 The present Study

We carried out two Experiments in which short and literally plausible idioms (e.g., break the ice), i.e. having a literal well-formed meaning and a conventional meaning, were embedded in literal or idiomatic contexts. Notably, materials were designed in such a way that the sentential context would constrain expectations on the upcoming target words to a similar extent across conditions. By doing so we minimized the impact of differential sentence constraints, known to elicit N400 effects, and we carried out a comparison between sentences that were semantically well-formed and for which contextual expectations on upcoming words were always fulfilled. Experiment 1 used EEG measures as dependent variable to investigate the time course of idioms comprehension and was followed up by Experiment 2 in which a cross modal priming paradigm was implemented, in order to confirm

al., 2010] processing, or semantic pragmatic reanalysis (frontal Post-N400 Positivity) [e.g., Van Berkum et al., 2009; Molinaro et al., 2012]. Another result that has been previously reported in the ERP literature of idioms processing is the finding of an involvement of the P300 component. The P300 is generally associated with cognitive mechanisms of context update [Donchin & Coles, 1988] or context closure [Verleger, 1988]: Vespignani et al. (2010) found that the brain’s electrical response to the correct idiom constituent was different if recorded before or after the idiom recognition point (RP, e.g., prendere il toro per le ... [RP] corna, ‘take the bull by the ... [RP] horns’). The match to the correct idiom word was associated with an N400 reduction before recognition, but the electrophysiological response led to a P300 effect after the recognition of the idiom. Such effect would mirror a qualitative change in readers’ expectations about upcoming words, after the expression has been recognized. We also expected to replicate Rommers et al (2013) results in the time-frequency domain of the EEG. The authors observed a power increase in the upper gamma frequency band after the presentation of the expected target words in literal but not in idiomatic contexts, supporting the hypothesis that semantic unification mechanisms are less engaged in idioms comprehension.

3 Method

3.1 Participants

380 students at Università degli studi di Modena e Reggio Emilia participated in the study set up to norm the experimental materials. 32 different students took part in Experiment 1. 42 students volunteered in Experiment 2.

3.2 Materials

Experiment 1 materials were 90 idiomatic expressions of similar structure (VP+NP idioms) embedded in sentences. Idioms were selected for being highly Familiar and correctly paraphrased. Three sentential contexts for each expression were created so that the last word of the expression was highly predictable in the three contexts (above 85% cloze probability). ERPs were time-locked to the presentation of the first word of the expression (W1), and epochs comprising W1, W2 and W3 were extracted from the EEG. In Experiment 2 a subset of 44 idioms was used.

1a) La maestra aveva notato che Nicola disturbava i compagni, ma la prima volta chiuse un occhio e continuò la lezione.
(The teacher saw Nick was bothering his desk mate but for the first time she closed an eye (turned a blind eye) and kept on teaching.)

1b) Alla visita oculistica Enrico, prima di leggere le lettere indicate sulla lavagna luminosa, chiuse un occhio per valutare la miopia.
(At the Ophthalmological visit, before starting to read the letters on the panel aloud Henry closed an eye in order to evaluate his nearsightedness.)

1c) Giovanni ha rotto gli occhiali durante la rissa perché ha preso un pugno in un occhio e gli sono caduti a terra.
(Jack broke his glasses during the fight because he got a punch in his eye and they fell on the ground.)

3.3 Procedure

In Experiment 1 sentences were presented word-by-word at the centre of the screen (SOA=600ms). In Experiment 2, context sentences were auditorily presented via headphones until the last word of the expression. Targets that could be related or unrelated to the literal meaning of the last word of the expression were visually presented at the offset of the audio file.

4 Results

Fig. 1 Grand Average ERPs from a pool of 7 frontal electrodes (AF3, AF4, F3, FZ, F4, FC1, FC2) in which frontal PNP effects are usually reported (negative voltage is plotted upwards). Idiomatic condition (solid line), Literal condition (dashed line) and Control condition (dotted line) are compared at the onset of the last word of the idiomatic expression (e.g., ice).

Experiment 1 showed that:
- No N400 differences emerged between literal and idiomatic context, during the processing of the three constituent words.
- Differences between Idiomatic vs. Literal, and Idiomatic vs. Control conditions emerged during the presentation of the last word of the expression (e.g., ice), and occurred in the 400 to 600 ms time interval.
- Consistent with the study by Rommers et al (2013), the Time-Frequency analysis of the EEG revealed power differences in the higher gamma frequency band (60-80Hz) between expressions embedded in literal vs. idiomatic contexts: no power increase was associated with the idiomatic condition.

Experiment 2 showed that:
- Target words related to the literal meaning of the idiomatic constituents obtained faster lexical decision times with respect to unrelated targets, regardless of type of context.

5 Discussion

Concerning the question related to how the meaning of the whole idiom is integrated in the sentence representation, our results suggest that integration mechanisms occur only upon presentation of the last constituent word, when the idiomatic expression has very likely been recognized. On the last constituent, ERP differences between idiomatic and literal contexts emerged between 400 and 600 ms in frontal electrodes. The timing and scalp distribution of the effect suggest that it affected a positive component (the frontal Post-N400 Positivity) occurring soon after the peak of the N400. These results could be accommodated elaborating the framework proposed by the Retrieval-Integration hypothesis [Hoeks & Brouwer, 2014], which holds that semantic-pragmatic integration processes are reflected in P600-like positivities. One possible interpretation is that the observed frontal positive shift might be part of a larger family of positive components reflecting the engagement of a semantic/pragmatic wrap-up mechanism that is performed at the end of the expression to assign a full interpretation to the incoming input.

Concerning the second experimental question related to the composition of individual constituent words we argue that Experiment 2 showed that the literal meaning of the last word of the expression was at least accessed, and confirms other evidence supporting the idea that readers process the literal meaning of idiomatic constituents (Boulenger, Shtyrov & Pulvermüller, 2012). Moreover, the lack of N400 differences across conditions and word positions suggests that lexical retrieval processes similarly occurred in literal and idiomatic contexts. However, the analysis of the frequency domain replicated Rommers et al’s findings of a larger power increase in the high gamma frequency band for literal compared to idiomatic contexts, which, consistently with their interpretation, could signal that word-by-word composition mechanisms are less engaged in idioms comprehension.

Conclusions

When presented with idiomatic expressions readers retrieve the literal meaning of the constituent words. However, word-by-word semantic composition mechanisms are idling, and, only at the end of the expression, a semantic/pragmatic wrap-up of the idiom is carried out to update the sentence representation.

Suffixation and the expression of time and space in Modern Greek

Anna Anastassiadis-Symeonidis
Aristotle University of Thessaloniki
[email protected]

Abstract

This paper draws a comparison, through semasiological and onomasiological methods, of three Modern Greek (MG) suffixes -in(os), -iatik(os) and -isi(os), which construct denominal adjectives of time and/or space. Following D. Corbin’s model (1987; 1991 and forthcoming) of Construction Morphology, an in-depth analysis of these suffixes’ semantics will be presented. The results suggest that, in order to construct a denominal adjective following the relational Lexeme Construction Rule (LCR_REL), a categorical, semantic and pragmatic compatibility are necessary between the base-noun and the suffix, as well as between the suffixed adjective and the noun of the noun phrase (NP); there are no synonyms even if the same noun is used as a base-noun. The three suffixes differ with respect to their semantic and pragmatic features; as a consequence, they are used in different genres. The data has been drawn from many dictionaries and especially from the Reverse Dictionary of Modern Greek (Anastassiadis-Symeonidis, 2002) as well as the Corpus of Greek Texts (Goutsos 2003).

1 The suffixes

1.1 The -in(os) suffix

This suffix is applied to a nominal base, or an adverbial one which could, however, be considered as a nominal one, given that these adverbs function also as nouns (Berthonneau 1989: 493). Consequently, we suggest a unified nominal base. In our corpus, base-nouns (87%) belong to the category of temporal or spatial nouns, e.g., proinos ‘of early morning’, vradinos ‘of the evening’, kalokairinos ‘of the summer’, pashalinos ‘of Easter’, aprilianos ‘of April’, simerinos ‘today’s/of today’, pantotinos ‘of ever - everlasting’ - vorinos ‘north’, antikrynos ‘of the opposite side’, brostinos ‘of the front’, makrinos ‘distant’.

The temporal sense base-nouns can label one of the denominations of the internal structure of the time unit YEAR, e.g., kalokairi ‘summer’, theros ‘summer’, fthinoporo ‘autumn’, or DAY, e.g., proi ‘morning’, vradi ‘evening’, or designate one of their special denominations, e.g., Aprilios ‘April’. Aside from these base-nouns, we observe that the base can be selected from the names of important celebrations e.g., Pasha ‘Easter’, and that the specific deictic (NOW) denominations construct denominal adjectives exclusively with the suffix -in(os), e.g., simerinos ‘of today’, apopsinos ‘of this evening’, htesinos ‘of yesterday’, torinos ‘of now’, fetinos ‘of this year’, persinos ‘of last year’, pantotinos ‘of ever - everlasting’.

Following our observation of spatial sense base-nouns we operate a distinction between: (i) a group of nouns referring to geographical terms, e.g., vorras ‘north’, oros ‘mountain’, thalassa ‘sea’; (ii) toponyms, e.g., Alexandria ‘Alexandria’; and (iii) adverbs constructing denominations within the deictic system (HERE), e.g., antikry ‘across’, konta ‘near’, makria ‘far’, piso ‘behind’.

Finally, based on the context, the remaining nouns in the corpus (13%) can be categorized as conveying spatial meaning (provenance), e.g., agheladhino ghala ‘cow’s milk’, vodhino/hoirino kreas ‘bovine (beef)/pig (pork) meat’, kreatini/tyrini evdhomadha ‘Meatfare/Cheesefare week’, anthropini symperifora ‘human behaviour’. The same principles hold for the adjectives foteinos ‘bright’, faeinos ‘brilliant’, skoteinos ‘dark’, alithinos ‘real’, that originate in ancient Greek, where the base-noun functioned as a spatial noun; relevant passages are preserved where the nouns fos ‘light’ and skotos ‘darkness’ refer to the source that transmits light and darkness respectively (Giannakis, 2001). Similarly, alithinos ‘real’ refers to location, since –according to Plato– truth originates from the real world.

1.2 The -iatik(os) suffix

From a semantic point of view, we notice that approximately 85% of the corpus consists in bases which are temporal nouns referring to time-measure units, e.g., hronos ‘year’, minas ‘month’, (e)vdhomadha ‘week’, as well as their reanalyses, including two subsets: (i) denominations of special units, e.g., Dheftera ‘Monday’, Triti ‘Tuesday’, Ianouarios ‘January’, Fevrouarios ‘February’; and (ii) denominations related to the internal structure of the above units, e.g., proï ‘morning’, mesimeri ‘midday’, anoiksi ‘spring’ (Berthonneau, 1989).

In addition, the base can be selected among important days of public holidays or religious celebrations with which people mark time, and which are therefore categorized as temporal nouns, e.g., Protomaghia ‘First of May’, Protaprilia ‘First of April’, Protochronia ‘New Year’s Day’, Pasha ‘Easter’, Hristoughenna ‘Christmas’, Aghio-Vasilis ‘the feast day of Saint Vasilios’, Aï-Dhimitris ‘the feast day of Saint Demetrios’, Kathari Dheftera ‘Clean/Ash Monday’, apokria ‘Carnival festivities’, paramoni ‘Eve’.

Finally, the suffix -iatik(os) is attached to the base form of 7 nouns, seemingly not associated with a temporal sense: paidh(i) ‘child’, ghiort(i) ‘celebration’, skol(i) ‘leisure’, feggar(i) ‘moon’, ghampr(os) ‘groom’, nyf(i) ‘bride’, kefal(i) ‘head’. However, these nouns can be encountered in contexts that are associated with important moments of people’s lives, e.g., ghampriatiko kostoumi ‘bridegroom’s suit’, nyfiatiko traghoudi ‘wedding song’, paidhiatika kamomata ‘childish antics’.

1.3 The -isi(os) suffix

The suffix -isi(os) is associated with the notion of ‘provenance’ (Tsopanakis, 1994), which is diachronic in nature, particularly since the suffix -isi(os) is derived from the Latin suffix -ēnsis which is associated with this notion (Meyer, 1895). This is a spatial provenance (where the base is a proper or common noun referring to the natural landscape or to man-made places (Le Pesant, 2011), e.g., vounisios aeras ‘mountain air’, limnisio psari ‘fish of the lake’); even if the base-noun refers to an animal, e.g., arnisia païdakia ‘lamb cutlets’, ghidhisio ghala ‘goat milk’, katsikisio tyri ‘goat cheese’, to a plant, e.g., kalampokisio alevri ‘corn flour’, thymarisio meli ‘thyme honey’, to an artefact, e.g., varelisia bira ‘draught’, to a human or human-like being (human entity) or to parts of the human body, through extension, e.g., flevisio aima ‘veins’ blood’ or through an intension reading, related to a stereotypical meaning, e.g., gherontisia foni ‘elderly’s voice’.

The availability of the suffix -isi(os) in contemporary language use is rather restricted, as it is not encountered in cases where it is possible to construct non-attested lexemes which constitute nothing more than coincidental gaps (Corbin, 1987: 177).

2 Is there synonymy?

We argued that the -in(os) suffix constructs denominal adjectives related to space and time, that the -iatik(os) suffix constructs denominal adjectives related to time and that the -isi(os) suffix constructs denominal adjectives of provenance, related to the notion of space. The question will thus be the following: can we talk about synonymy between the temporal and spatial denominal adjectives constructed with the aforementioned suffixes and the same base-noun?

If we take into account the pragmatic feature [learned], a feature with a non-binary value (Anastassiadis-Symeonidis and Fliatouras, 2004), we notice that for the base-nouns with a [+learned] value, only the -in(os) suffix is applied, that for the base-nouns with a [-learned] value only the suffixes -iatik(os) and -isi(os) are applied, and that for the base-nouns with a [+/-learned] value all three suffixes -in(os), -iatik(os) and -isi(os) are applied. The reason is that the suffix -in(os) constructs denominal adjectives localizing in space and time objectively, i.e., free of prototypical or stereotypical perceptions (Geeraerts, 1985), contrary to the suffixes -iatik(os) and -isi(os), that are associated with the individual’s everyday life. Consequently, the derived adjectives are not synonymous, even if the aforementioned suffixes are attached to the same base, e.g., vradino/*vradhiatiko dheltio eidhiseon ‘the evening news report’, or to a synonymous base, e.g., arnisia/*provatisia païdhakia ‘lamb cutlets’. This is the reason for which only adjectives in -in(os) are encountered in scientific and religious discourse, in greater percentages in premeditated speech on television and the radio, as well as in newspapers. This means, semantic/pragmatic factors determine the genre of text where a lexeme may be encountered. It is not by chance that the pragmatic feature [learned] is attributed to a suffix found in ancient Greek and the feature [-learned] to suffixes that appeared later, during the Hellenistic era.

3 Compatibility

A categorical as well as semantic and pragmatic compatibility are therefore necessary between the base-noun and the suffix as well as between the derived adjective and the modified noun. For instance, there would be an issue of categorical compatibility if the suffix -in(os) or the suffix -iatik(os) were attached to a verb-base. There would be an issue of semantic compatibility if the suffix -in(os) were attached to a non-temporal/spatial base-noun or if the suffix -iatik(os) were attached to a non-temporal base-noun. Lastly, there would be an issue of pragmatic compatibility if the suffix -in(os) were attached to a [-learned] base-noun or if the suffix -iatik(os) were attached to a [+learned] base-noun, e.g., if the adjective aniksiatikos ‘of spring’ modified the noun isimeria ‘equinox’.

Therefore, each of the aforementioned suffixes is characterized by its categorical, semantic, and pragmatic/stylistic specifications and, according to this “genetic inheritance”, it participates in the LCR_REL. Subsequently, within the framework of Construction Morphology, the notion of compatibility constitutes the key to grammaticality judgements.

4 Predictions

cow/cow’s (milk/meat)’: the suffix -in(os) selects certain properties from the anaphoric/descriptive meaning of the base-noun, whereas the suffixes -iatik(os) and -isi(os) select from the base-nouns those properties that correspond to an experiential meaning associated with everyday life. We can thus explain why the adjectives in -in(os) and -iatik(os), or those in -in(os) and -isi(os) are not synonyms.

b) The reason why certain suffixes cannot be attached to certain base-nouns: compatibility is required between the two. The adjectives in -in(os) are likely derived from the [+learned] or [+/-learned] allomorph of the base-noun, whereas the adjectives in -iatik(os) and -isi(os) are derived from the [-learned] or [+/-learned] allomorph of the base-noun, e.g., mesimvrinos and mesimeriatikos but *mesimvriatikos ‘midday’, pedhinos and kampisios but *pedhisios/*kampinos ‘of/in a plain’, therinos but *theriatikos ‘of the summer’, heimerinos but *heimeriatikos ‘of the winter’, omfalios and afalisios but *omfalisios, *afalios ‘umbilical’.

c) The reason why both the adjectives kalokairinos and kalokairiatikos ‘of the summer’ are grammatical without being synonymous: they both share the [+/-learned] feature.

d) The reason why it is grammatical to say praghmatika anoiksiatikos kairos ‘real spring weather’, praghmatika vounisios aeras ‘real mountain air’, but we do not say *praghmatika earini isimeria ‘real vernal equinox’, *praghmatika oreinos oghkos ‘real mountain massif’: the adverb praghmatika ‘real/proper’ modifies qualifying adjectives but not taxonomic/relational ones.
idiomatic semantic processing differs from literal the activation of the literal meaning of the idio- Starting from the semantic function of each e) The reason why the suffix -in(os) is se- semantic processing and this might be due to the matic constituents in both types of contexts. suffix at the word-construction level of words lected in utterances that refer to the speaker’s paradoxical nature of idioms [e.g., Libben & Ti- On the basis of the previous ERP literature we that belong to the same onomasiological field, on “HERE and NOW”, within the deictic system: tone, 2008], which seem to be at the same time hypothesized that meaning retrieval processes one hand, similarities as well as differences at adjectives in -in(os) merely denote a location in amenable of direct memory retrieval and word- would affect the N400 component [e.g., Feder- both the semantic and pragmatic level can be space and time; that is, within the NP, they create by-word compositional analysis. meier, 2007]: more demanding retrieval proc- explained. For example, terms such as: kalokair- a temporal or spatial relationship between the The two main questions of the present re- esses should be associated to larger N400 effects. iatikos – kalokairinos ‘of the summer’, kampisi- modified noun and the time period or the loca- search thus concern two aspects of idiom The debate about the role of the N400 in seman- os – pedhinos ‘of/in a plain’; on the other hand, tion signified by the base-noun. Conversely, the comprehension: one relates to how the meaning tic integration vs. 
retrieval mechanisms [see se- predictions can be formulated, in the sense that suffix -iatik(os) is associated with a subjective, of the whole is retrieved and integrated in the mantic unification processes in Hagoort & Van restrictions are imposed, e.g., avrianos - experiential and/or stereotypical temporal mean- sentence representation; the second relates to Berkum, 2007] makes it hard to exclude that the *avriatikos ‘of tomorrow’, kontinos - *kontaios ing, while the suffix -isi(os) is experientially as- what happens to word-by-word semantic N400 component is not associated with the se- ‘near’ (similarly: mesaios ‘middle’), ghenarisios sociated with the notion of provenance, e.g., composition of the literal meanings of the mantic integration of the meaning of the whole; - *ianouarisios ‘of January’. brostinos - *brostisios ‘of the front’, simerinos – expression: is it carried out or suspended? To however, given the available evidence on figura- According to this model we are able to ex- *simeriatikos, *simerisios ‘of today’. answer these questions we used EEG measures tive language processing, we could also expect plain: f) The reason why the adjectives tritiatikos (with the analysis of Event-Related Potentials an effect on later occurring positivities, previ- a) The reason why it is possible to derive ad- ‘of Tuesday’, tetartiatikos ‘of Wednesday’, and oscillatory dynamics of Time-Frequency ously associated with metaphor (Late Positive jectives with different suffixes from the same pemptiatikos ‘of Thursday’ (and the correspond- representations) because of their temporal Complex, LPC) [e.g., Coulson & Van Petten, base-noun e.g., vradhinos – vradhiatikos ‘of the ing adverbs) are not encountered in written texts: precision [e.g., Luck, 2014], and because of the 2002; Lai et al., 2009] or irony (P600) [Regel et evening’, agheladhisios – agheladhinos ‘of a are they potential or non-grammatical words? 
possibility of disentangling between memory Copyright © by the paper’s authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org

According to the theoretical framework followed throughout this article, the aforementioned words are constructed according to the LCR_REL and are, therefore, potential words. However, they are not encountered in written texts due to pragmatic factors: individuals, who mark time and demarcate their lives according to a sum (in our case, a sum of days), are inclined to pay attention only to the beginning and the end; that is, the days that mark the beginning and the end of the week are of particular importance to people.

Based on what I have stated above, I suggest the following categorization of the three suffixes according to semantic criteria:

         experiential    objective
space    -isi(os)        -in(os)
time     -iatik(os)      -in(os)

Table 1: Semantic distribution of the suffixes -iatik(os), -in(os), -isi(os)

5 Impact on the theory of derivation

Every suffix is characterized by its categorical, semantic, and pragmatic/stylistic specifications and, according to this “genetic inheritance”, it participates in the LCRs. Consequently, within the field of Construction Morphology, the notion of compatibility is a key notion for grammaticality judgements. Thus, it seems to me that it is a bit far-fetched to attribute anomalies/exceptions, or even a lack of productivity, to the lexicon merely because the study of the lexicon constitutes unmapped territory (see also Anastassiadis-Syméonidis, 2003).

Similarly, as there is no synonymy between lexemes, there is neither synonymy between suffixes nor between their derivatives, even if the related suffixes are attached to the same base or if the same suffix is attached to a synonymous base.

Lastly, semantic/pragmatic reasons determine the genre of text wherein a derived lexeme will appear, due to semantic/pragmatic features of both the base and the suffix.

6 Conclusion

Since the lexicon does not constitute a separate level of linguistic analysis, but horizontally cuts through all levels, the properties of those levels are to be taken into consideration, that is, phonological, morphological, syntactic, semantic and pragmatic.

This study examines the abstract system, in the form of LCRs and the suffixes’ semantic instruction, which, according to several theories, is homogeneous. However, the present study is based on actual language use, since it takes into consideration rich authentic language data within context, linguistic production of native speakers, as well as metalinguistic texts. In particular, the study of concordances in the Corpus of Greek Texts illustrated the breadth of use of derivatives that carry the suffixes in question.

The homogeneity of the abstract system is contrasted to the linguistic variety characterizing the use of the system, and simultaneously, it constitutes an essential linguistic attribute.

In our case, variety is associated with the varying degrees of availability of the suffixes in question, as well as with the [+/- learned] feature. This simultaneous examination is beneficial to both, as it bridges the gap between theory and practice to the extent that one fuels the other. This is a dynamic, dialectical relationship that explains language change, which has been a topic of interest either in the form of borrowing, during earlier times, or through the non-frequent occurrence of the -isi(os) suffix in contemporary language.

Furthermore, an association has been attempted between the onomasiological method – which, in our case, originates from the notion of time and space – and the semasiological method. The latter, starting from the form of the suffixes -iatik(os), -in(os) and -isi(os), focused on the extensive analysis of their semantic instruction, unlike other studies that are limited to a basic presentation of semantic features.

Within D. Corbin’s theoretical framework of Construction Morphology, meaning occupies a central role, since the units that contribute to it are meaning-bearing units. The constructed lexemes demand a more complex analysis at the semantic level in comparison to simple ones. The reasons are multiple: (i) because two elements participate – the base and the suffix; (ii) because the suffix is encountered in many other constructed lexemes; (iii) because the base is part of other constructed lexemes with a different suffix; and (iv) because the meaning and the behavior at the level of anaphora of constructed lexemes are associated with their morphological structure.

Through implementing this theoretical framework, it was possible to compare the semantic instruction of the suffixes -iatik(os), -in(os) and -isi(os), the interpretation of semantic similarities and differences between derived words that carry those suffixes, as well as the interpretation of grammaticality through the notion of compatibility between the base-noun and the suffix with regard to grammatical category, meaning, and pragmatic level.

References

Anastassiadis-Syméonidis, A. 1996. A propos de l’emprunt suffixal en grec moderne. Cahiers de Lexicologie 68/1, 79-106.

Anastassiadis-Syméonidis, A. 2002. Reverse Dictionary of Modern Greek. Institute of Modern Greek Studies, Aristotle University of Thessaloniki, www.komvos.edu.gr, www.greek-language.gr

Anastassiadis-Syméonidis, A. 2003. Inflexion and derivation: myths and truths. Studies on Greek Linguistics – Proceedings of the 24th Annual Meeting of the Department of Linguistics, School of Philology, Faculty of Philosophy, Aristotle University of Thessaloniki, 43-54.

Anastassiadis-Syméonidis, A. 2008. Les adjectifs temporels suffixés en -in(os) et -iatik(os) en grec moderne. In Bernard Fradin (éd.) La raison morphologique - Hommage à la mémoire de Danielle Corbin, Amsterdam, Benjamins, 17-27.

Anastassiadis-Syméonidis, A. 2009. Suffix -isi(os) in Modern Greek. Studies in Greek Linguistics – Proceedings of the 29th Annual Meeting of the Department of Linguistics, School of Philology, Faculty of Philosophy, Aristotle University of Thessaloniki, 58-73.

Anastassiadis-Syméonidis, A. 2010. Pourquoi une langue emprunte-t-elle des suffixes ? L’exemple du grec et du latin. META 55, 1. Mélanges en hommage à André Clas, 147-157.

Anastassiadis-Symeonidis, A. & Fliatouras, A. 2004. The distinction between learned and non learned in Modern Greek. Proceedings of the 6th International Conference of Greek Linguistics, University of Crete, 110-120.

Babiniotis, G. 1998, 2002. Dictionary of Modern Greek, Athens, Centre of Lexicology.

Baldinger, K. 1964. Sémasiologie et onomasiologie, Revue de linguistique romane 28, 249-272.

Berthonneau, A.-M. 1989. Composantes linguistiques de la référence temporelle. Les compléments de temps. Du lexique à l’énoncé. Thèse d’Etat, Paris VII.

Buck, C. 1933. Comparative Grammar of Greek and Latin, Chicago, University of Chicago.

Corbin, D. 1987. Morphologie dérivationnelle et structuration du lexique. Villeneuve d’Ascq: Presses Universitaires de Lille 1991².

Corbin, D. 1991. Introduction – La formation des mots: structures et interprétations. Lexique 10, 7-30.

Corbin, D. forthcoming. Le lexique construit. Méthodologie d’analyse.

Dictionary of Modern Greek 1933. Athens, newspaper Proïa, St. Dimitrakos.

Dictionary of Standard Modern Greek 1998. Thessaloniki, Institute of Modern Greek Studies, Aristotle University of Thessaloniki.

Filos, P. 2008. Studies in the Morphology of Latin Loanwords into Greek: Evidence from the Papyri. PhD, University of Oxford.

Geeraerts, D. 1985. Les données stéréotypiques, prototypiques et encyclopédiques dans les dictionnaires. Cahiers de Lexicologie 46/1, 27-43.

Giannakis, G. 2001. Light is Life, Dark is Death: An Ancient Greek and Indo-european Metaphor, Dodoni-Philologia 30, 127-153.

Goutsos, D. 2003. Corpus of Greek Texts (CGT). Proceedings of the 6th International Conference of Greek Linguistics, University of Crete, 930-939.

Gross, G. 1994. Classes d’objets et description des verbes, Langages 115, Paris, Larousse, 15-31.

Haspelmath, M. & Sims, A. 2010². Understanding Morphology. London, Hodder Education.

Klairis, Chr. & Babiniotis, G. 2005. Grammar of Modern Greek. Structural/functional – communicative. Athens, Ellinika Grammata.

Kleiber, G. 1990. La sémantique du prototype. Catégories et sens lexical, Paris, P.U.F.

Le Pesant, D. 2001. Les noms locatifs. HDR, Paris XIII.

Liddell, H. G. & Scott, R. 1996. A Greek-English Lexicon, Oxford University Press.

Meyer, G. 1895. Neugriechische Studien III. Die lateinischen Lehnworte im Neugriechischen. Sitzungsberichte, usw., Band 132, Vienna.

Modern Greek Grammar (of Demotic Greek) 1941. Thessaloniki, Institute of Modern Greek Studies, Aristotle University of Thessaloniki, 1978².

Palmer, L.R. 1946. A Grammar of the Post-Ptolemaic Papyri. London, Oxford University Press.

Petrounias, E. 1998. Dictionary of Standard Modern Greek [Etymological Part], Thessaloniki, Institute of Modern Greek Studies, Aristotle University of Thessaloniki.

Psaltes, St. 1913. Grammatik der Byzantinischen Chroniken. Göttingen, Vandenhoeck & Ruprecht.

Rey, A. 1992. Dictionnaire historique de la langue française. Paris, Dictionnaires Le Robert, 2 vol.

Taylor, J.R. 1989. Linguistic Categorization – Prototypes in Linguistic Theory. New York, Oxford University Press.

Temple, M. 1993. Le sens des mots construits : pour un traitement dérivationnel associatif, PhD, University of Lille III.

Identifying Existing and Novel Compound Words in Reading Finnish: An Eye Movement Study

Jukka Hyönä, Department of Psychology, University of Turku, Turku, Finland, [email protected]
Minna Koski, Department of Psychology, University of Turku, Turku, Finland, [email protected]
Alexander Pollatsek, Department of Psychology, University of Massachusetts, Amherst, MA, USA, [email protected]

1 Introduction

According to the dual-route race model of compound word identification (Pollatsek, Hyönä, & Bertram, 2000), the holistic route and the morphological decomposition route operate in tandem. Bertram and Hyönä (2003) posited that word length modulates the interplay between the two access routes. When a compound word is sufficiently short so that all or most of its letters fall on the foveal region when fixating it during reading, the holistic route gets a head start and completes faster than the morphological route, and thus the word is more likely to be identified as a whole. On the other hand, when a compound word is so long that a subset of letters is beyond foveal reach, the identification is initiated by first recognizing the initial constituent, followed by the recognition of the second constituent and that of the whole word.

In their study examining the processing of novel compound words, Pollatsek et al. (2011) demonstrated that the decomposition route played an even more prominent role in processing novel than lexicalized compound words. Pollatsek et al. (2011) compared the processing of novel and existing Finnish compound words by manipulating the frequency of the first constituent as an independent word, separately for long (average length of 13 letters) existing and novel compound words. The length of the first constituent as well as the frequency of the second constituent was matched across conditions. For first fixation duration, which indexes early effects in word processing, an effect of first-constituent frequency was observed that was similar in size for existing and novel compound words. For gaze duration (i.e. the summed duration of fixations made on the word before exiting to the right or left), the first-constituent frequency effect was greater for novel than existing compound words. For the latest stages of processing during the first-pass reading, indexed by fixation time spent on the target word after fixating away from the first constituent but before exiting the word, only a main effect of novelty was observed. As regards the processing of long novel compound words, the pattern of results was taken to suggest a two-stage process. During the first stage, lexical access is achieved for the compound word constituents. During the second stage, the meaning of the novel compound word is composed out of the constituent meanings. The second stage is assumed to take longer when the frequency of the first constituent is low, because the prototypical relationships that the low-frequency first constituent would be engaged in compounding are not firmly established.

In the present study, we further investigated the processing of novel and lexicalized Finnish two-noun compound words. This time we manipulated the frequency of the second constituent (the compound head). This was done separately for existing and novel compound words. Moreover, we also manipulated the length of the compound words. If indeed word length strongly determines the interplay between the holistic and decomposition routes in compound word identification, as argued by Bertram and Hyönä (2003), the manipulation of the second-constituent frequency tapping into the decomposition process should result in different types of processing especially for short existing versus novel compound words. Short existing compound words are more likely to be identified by the holistic route, whereas short novel compound words have to be processed via the morphological decomposition route. For long compound words, on the other hand, the manipulation of the second-constituent frequency should lead to less dramatic differences between existing and novel compounds, as the decomposition route is assumed to be in operation for both word types.

Adult readers read sentences silently for comprehension while their eye movements were registered. The target compound words were embedded somewhere in the middle of the sentences. The frequency of the second constituent as a separate word was manipulated for short (7-9 letters) lexicalized (e.g., savukala = smoked fish) and novel (e.g., hymykisa = smile contest) compounds as well as for long (12-16 letters) lexicalized (e.g., hiekkapaperi = sand paper) and novel (e.g., skandaalivaali = scandal election) compound words. Thus, the experimental design was a 2 (second-constituent frequency: low vs. high) x 2 (word type: existing vs. novel) x 2 (word length: short vs. long) within-participants design. Comprehensibility of the novel compound words was secured by a rating test conducted prior to the experiment proper. Only novel compound words whose meaning could be computed without providing any linguistic context were chosen for the study. The frequency of the first constituent was matched across the conditions, as was the frequency of the short and long existing compound words.

2 Results

Several eye fixation measures were used to tap into the time course of compound word processing. The earliest effects were measured by first fixation duration. Early, but less immediate effects were measured by second fixation duration and gaze duration. Still later effects were measured by total fixation time, which is the sum of all fixations, both first-pass and second-pass, made on the target word.

First fixation duration: In the earliest stages of foveal word processing, indexed by first fixation duration, no effects of novelty or second-constituent frequency were observed.

Second fixation duration: A bit later in the processing timeline, main effects of novelty and second-constituent frequency were obtained. These effects were modified by interactions involving word length, including the three-way interaction. This interaction was broken down by computing a separate 2x2 ANOVA for short and long compounds, respectively. These analyses revealed no effect of novelty or second-constituent frequency for long compounds, whereas for short compounds both main effects and their interaction proved significant. The interaction reflected the fact that the second-constituent frequency effect was only observed for short novel compounds.

Gaze duration: In gaze duration, summing up all fixations made during the first-pass reading, the main effects of word type, word length and second-constituent frequency were all significant. Gaze duration was significantly longer for novel than existing words, longer for long than short words, and longer for compounds containing a low-frequency than high-frequency second constituent. Similarly to second fixation duration, gaze duration also revealed a reliable three-way interaction between the manipulated factors. In order to examine the interaction in more detail, it was broken down into two separate 2x2 ANOVAs, one for the short and another for the long compound words.

For the long compound words, there was a main effect of word type and second constituent frequency, but no reliable interaction between them, suggesting that the second-constituent frequency effect was of similar magnitude for existing (an effect size of 91 ms) and novel (an effect size of 111 ms) compound words. However, for short words, the Word Type x Second-Constituent Frequency interaction proved significant. This interaction reflected the fact that the second-constituent frequency effect was considerably greater for novel (an effect size of 155 ms) than for existing (an effect size of 42 ms) compound words.

Total fixation time: We also analyzed the total fixation time spent reading the target words. This measure indexes late effects; it sums up the duration of all fixations made on the word during the first-pass and second-pass reading. In this measure, the three-way interaction obtained for second fixation duration and gaze duration was no longer significant. However, the interaction between word type and second-constituent frequency was almost significant. This interaction reflects the fact that in total fixation time the effect of second constituent frequency was greater for novel than existing compound words, regardless of word length. The size of the second-constituent frequency effect was 51 ms for the existing compounds and 151 ms for the novel compounds.

Summary of results: The following picture emerges from the pattern of results presented above. In the earliest stages of word processing, no signs of either novelty or second-constituent frequency were seen, which suggests that these effects took some time to develop during compound word identification. For long compounds, these effects were still absent in second fixation duration but emerged in gaze duration. For short compounds, the effects were already visible in second fixation duration. Finally, the measure indexing second-pass reading demonstrated a greater second-constituent effect for novel than existing compounds. All in all, the pattern of results suggests that meaning composition takes place with more delay for long than short compound words.

3 Conclusions

The present study provided further evidence for the view (Bertram and Hyönä, 2003, 2013), according to which word length modifies the relative role of the holistic versus the morphological decomposition route in compound word identification. The decomposition route is an integral part in identifying long compound words, because holistic processing is not viable due to visual acuity constraints. This became apparent in the effect of the second constituent frequency indexing access via morphological constituents being similar in magnitude for the novel and lexicalized compound words. On the other hand, when lexical access via the holistic route is a viable option, as is the case with short existing compound words that fit in the foveal area of the eyes when the word is fixated, the novelty effect emerged relatively early (during second fixation) and the second-constituent frequency effect was considerably smaller for existing than novel compound words during first-pass reading. Finally, the second-pass reading measure demonstrated a greater effect of constituent frequency for novel than lexicalized compounds. This may be taken to suggest that meaning composition takes longer when the frequency of the second constituent is low, since the typical relationships that low-frequency constituents are engaged in within compounds are less firmly established.

There are also two findings that are not completely in line with the visual acuity principle proposed by Bertram and Hyönä (2003). One is the absence of an early novelty effect for short compound words. If the holistic route is immediately activated when making the first fixation on the word, there should have been a novelty effect in first fixation duration. Second, there was a 42 ms effect of second constituent frequency in gaze duration even for existing short compound words, suggesting that the decomposition route also becomes active when identifying short lexicalized compounds.

A possible theoretical framework that can account for the obtained pattern of results is a dual route cascade model assuming that identification always starts out with the decomposition route, with the process quickly cascading into the holistic access in the case of short compounds and with some delay in the case of long compounds. When a compound word is short, its constituents are also short and may be accessed rapidly. On the other hand, when the word is long, not only are the constituents likely to be longer, which may lengthen their access, but the morphological segmentation process may also need additional time. Hence, the holistic route is activated with some delay after the decomposition route is activated. The suggested model may be further tested by replicating the present study while manipulating the frequency of the first constituent separately for long and short, novel and lexicalized compound words.

References

Bertram, R., & Hyönä, J. (2003). The length of a complex word modifies the role of morphological structure: Evidence from eye movements when reading short and long Finnish compounds. Journal of Memory and Language, 48, 615-634.

Bertram, R., & Hyönä, J. (2013). The role of hyphen at the constituent boundary in compound word identification: Facilitative for long, detrimental for short compound words. Experimental Psychology, 60, 157-163.

Pollatsek, A., Bertram, R., & Hyönä, J. (2011). Processing novel and lexicalised Finnish compound words. Journal of Cognitive Psychology, 23, 795-810.

Pollatsek, A., Hyönä, J., & Bertram, R. (2000). The role of morphological constituents in reading Finnish compound words. Journal of Experimental Psychology: Human Perception and Performance, 26, 820-833.

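The eye fixation measures used in the compound-word study (first fixation duration, second fixation duration, gaze duration, total fixation time) can be made concrete with a small sketch. The code below is only an illustration of the definitions given in the text, not the authors' analysis pipeline: the `Fixation` record, the `fixation_measures` function and the example numbers are all hypothetical, and it assumes a simplified trial format in which each fixation is already assigned to a word index.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Fixation:
    """One fixation in temporal order (hypothetical record format)."""
    word: int      # index of the fixated word in the sentence
    duration: int  # fixation duration in milliseconds


def fixation_measures(fixations: List[Fixation], target: int) -> Dict[str, Optional[int]]:
    """Compute reading-time measures for one target word in one trial.

    First pass = the first uninterrupted run of fixations on the target,
    i.e. from first entering the word until the eyes exit it to the left
    or right. Later fixations on the target count as second pass.
    """
    first_pass: List[int] = []
    entered = False
    for fix in fixations:
        if fix.word == target:
            entered = True
            first_pass.append(fix.duration)
        elif entered:
            break  # the eyes left the target: first pass is over

    if not first_pass:
        # Target was never fixated: no measure is defined for this trial.
        return {"first_fixation": None, "second_fixation": None,
                "gaze": None, "total": None}

    total = sum(f.duration for f in fixations if f.word == target)
    return {
        "first_fixation": first_pass[0],
        "second_fixation": first_pass[1] if len(first_pass) > 1 else None,
        "gaze": sum(first_pass),  # all first-pass fixations
        "total": total,           # first-pass plus second-pass fixations
    }


if __name__ == "__main__":
    # Invented trial: the target (word 3) gets two first-pass fixations
    # and is later re-fixated once after a regression from word 4.
    trial = [Fixation(2, 210), Fixation(3, 220), Fixation(3, 180),
             Fixation(4, 200), Fixation(3, 150), Fixation(5, 190)]
    print(fixation_measures(trial, target=3))
```

In this invented trial the target receives a first fixation of 220 ms, a second fixation of 180 ms, a gaze duration of 400 ms and a total fixation time of 550 ms; the 150 ms difference between the last two is the second-pass reading time. Real analyses usually add exclusion criteria (for example, trials in which the target was initially skipped), which this sketch leaves out.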

Same same but different: Type and typicality in a distributional model of complement coercion

Alessandra Zarcone (Universität des Saarlandes, Saarbrücken, Germany)
Sebastian Padó (Universität Stuttgart, Stuttgart, Germany)
Alessandro Lenci (Università degli Studi di Pisa, Pisa, Italy)
[email protected], [email protected], [email protected]

Abstract

We aim to model the results from a self-paced reading experiment, which tested the effect of semantic type clash and typicality on the processing of German complement coercion. We present two distributional semantic models to test if they can model the effect of both type and typicality in the psycholinguistic study. We show that one of the models, without explicitly representing type information, can account for the effects of both type and typicality in complement coercion.

1 Introduction: Complement Coercion

Complement coercion (The author began the book → reading the book) has been shown to cause an increase in processing cost (Pylkkänen and McElree, 2006; Katsika et al., 2012), which has been ascribed to a type clash between an event-selecting verb (begin) and an entity-denoting object (book). The increase in processing costs is found in comparison with a baseline condition, where the same verb is combined with an event-denoting object (journey), which does not trigger a type clash.

A second influence on processing cost is the thematic fit or typicality of the fillers of the verb's argument slots (Bicknell et al., 2010; Matsuki et al., 2011): high-typicality combinations are processed more quickly than low-typicality ones (the mechanic checked the brakes / the spelling).

Distributional semantic models (DSMs) can successfully model a range of psycholinguistic phenomena, including the effect of typicality on complement coercion (Zarcone et al., 2012). However, they generally do not include a notion of type. Can a DSM account for effects both of type and typicality?

In this paper, we consider experimental results from a study on complement coercion in German that manipulates both type and typicality. We discuss the performance of existing DSMs and a novel DSM combination. We also discuss how type information can emerge from distributional information.

2 Manipulating Type and Typicality

In a self-paced reading study on German complement coercion (Zarcone et al., in preparation), we manipulated both type and typicality. The dataset consists of 20 pairs of subjects (S) and aspectual verbs (V). Each pair is combined with four nominal objects (O) in SOV order:

[S Das Geburtstagskind] hat [O mit den Geschenken / der Feier / der Suppe / der Schicht] [V angefangen].
[S The birthday boy] has [O with the presents / party / soup / work shift] [V begun].

The objects are: a high-typicality entity (presents); a high-typicality event (party); a low-typicality entity (soup); and a low-typicality event (work shift). The low-typicality objects are drawn from the high-typicality objects of other S-V pairs.

The self-paced reading study yielded the following significant effects: (1) an effect of typicality on reading times (t = 2.28, p = .02) at the object region (indicating subject-object integration), (2) an effect of object type on reading times (t = −2.5, p = .01) at the verb region (the region of the type clash), (3) an interaction of type and thematic fit at the verb region (t = 2.04, p = .04). Mean reading times per condition are reported in Table 1. In sum, the study shows that complement coercion involves both type and typicality. Thus, computational models of complement coercion need to account for both.

                   Object region         Verb region
                   mit den Geschenken    angefangen
                   with the presents     began
high-fit entity    642                   819
high-fit event     655                   736
low-fit entity     667                   802
low-fit event      710                   806

Table 1: Mean reading times per condition (in ms) in the self-paced reading study.

3 Modeling the Experimental Results

Distributional semantic models (DSMs) represent word meaning as high-dimensional vectors recording co-occurrences with elements of their usage contexts. Semantic similarity is defined in terms of a vector similarity metric such as cosine. Distributional Memory (DM, Baroni and Lenci (2010)) is a DSM that integrates syntactic knowledge into the word representations. More concretely, the TypeDM version of DM records word-relation-word tuples ⟨w1 r w2⟩. The tuples are weighted by Local Mutual Information (Evert, 2005), which can be employed to model predicate-argument typicality. For example, the weight of ⟨book obj read⟩ is higher than that of ⟨label obj read⟩, which in turn is higher than that of ⟨elephant obj read⟩. TypeDM has been shown to be versatile and effective in several semantic tasks, including predicting verb-argument plausibility.

3.1 Complement Coercion and DSMs

DM has been extended into the Expectation Composition and Update model (ECU, Lenci (2011)), a family of procedures that can be used to predict the typicality of one sentence part given other sentence parts. For example, to model the typicality at the verb region in a German sentence with SOV word order (Das Geburtstagskind hat mit dem Geschenk angefangen / The birthday boy has with the present begun), ECU determines the thematic fit for the verb given subject and object:

• compute an expectation for the verb given the subject s, as the distribution over verbs v defined by the weights of the tuples ⟨s subj v⟩
• compute an expectation for the verb given the object o, as the distribution over verbs v defined by the weights of the tuples ⟨o obj v⟩.

To combine the subject and object expectations, we combine the two distributions component by component, typically either by sum or by product. This distribution is then represented in a vector space by computing the centroid or prototype of the vectors of the 20 most expected verbs. Finally, the thematic fit for a verb v given the subject s and the object o is its cosine similarity to the centroid.

ECU. We call the models following the ECU procedure SOV+ and SOV*, depending on their combination operation (sum and product, respectively). Simpler models only consider the influence of subject or object on the verb (SV and OV respectively), just by leaving out the combination step. These models can successfully account for reading time results on a dataset of complement coercion in German that manipulates typicality but not type (Zarcone et al., 2012).

In order to test ECU on a dataset which manipulates both type and typicality, we evaluate the following ECU models on the complement coercion data in (Zarcone et al., in preparation): SO to model effects at the object given the subject; SOV+, SOV* and OV to model effects at the verb. We expect these models to account for the typicality effect at the object (1), but not for the type effects at the verb (2,3).

The results are summarized in Table 2 (left and middle). In accordance with our prediction, SO correctly yields the typicality effect at the object (F = 7.38, p < 0.01). Neither SOV+, SOV*, nor OV can model the type-typicality interaction at the verb (3). Surprisingly, though, SOV* and OV yield (2), an effect of type at the verb (F = 5.3228, p < 0.05 and F = 20.388, p < 0.001, respectively).

Joint Expectations. The reading time study found that the subject-object typicality effects linger at the verb, interacting with type. The main shortcoming of ECU is its inability to model the typicality effects at the verb. This is due to the architecture of the SOV models (cf. Fig. 1, top): they compute the expectations for the verb first from the subject (SV) and update them with the object's expectations (OV). They ignore the interaction between subject and object (SO), which is the source of typicality effects (1,3); this corresponds to the assumption that this interaction should only matter at the object. In order to account for this, we draw an analogy to the concept of joint probability:

P(S, O, V)

which (by the chain rule, and assuming that the verb depends on the subject only through the object) can be written as

P(S) P(O|S) P(V|O)

Treating the first term as a constant prior, we obtain

P(O|S) P(V|O)

which we can interpret distributionally as motivation to reweight the typicality of the verb given the object with the typicality of the object given the subject, thus re-introducing the subject-object interaction into the verb prediction (cf. Figure 1, bottom).

[Figure 1: ECU (top) vs. Joint Expectations (bottom) for the verb; diagram over Subject, Object and Verb with SO and OV links.]

In the Joint Expectation (JE) model, the thematic fit score assigned to the target verb is influenced both by the verb's thematic fit with the object (the verb's initial thematic fit score, equivalent to the ECU weight for the ⟨object obj verb⟩ tuple) and by the object's thematic fit with the subject (equivalent to the ECU weight for the ⟨subject verb object⟩ tuple), which in turn is used to reweight the verb's score.

Similar to ECU, there is a choice of combination operations in JE (sum or product). Since JE can be formulated as a simple wrapper around ECU, ECU can be used to compute the individual components (e.g. SO, OV, or more complex ones) and these then just need to be combined additively (SO+OV) or multiplicatively (SO*OV).

The right-hand side of Table 2 shows the results for JE. SO+OV yields an effect of typicality (F = 6.777, p < 0.05) but no effect of type (2) or interaction (3). SO*OV yields two main effects of (2) type (F = 7.2359, p < 0.05) and typicality (F = 7.2359, p < 0.01), although no interaction (3).

Comparing the two models, we see that ECU SO accounts for the results obtained at the object (1), but the SOV models cannot explain the interaction with typicality on the verb (2,3). JE (SO * OV) models the effects of both type (2) and typicality at the verb, but does not (yet) account for their interaction (3).

                                                                 non-compos.    ECU            JE
                                                                 SO     OV      SOV+   SOV*    SO+OV  SO*OV
(1) effect of typicality at the object region (SO interaction)   ✓              ×      ×       ✓      ✓
(2) effect of type at the verb region (type clash)                      ✓       ×      ✓       ×      ✓
(3) type x thematic fit interaction at the verb region                  ×       ×      ×       ×      ×

Table 2: Overview of the results of the different DSMs: non-compositional, ECU and JE.

4 Discussion: Type and Typicality

We found that the SO model successfully accounts for the effect of typicality at the object. This is not surprising: one of the most typical tasks successfully performed by distributional models such as ECU is predicting verb-argument plausibility, and ECU had already been successful in modeling effects of typicality on reading times in German complement coercion (Zarcone et al., 2012).

On the other hand, the ECU SOV models were not able to account for the type-typicality interaction at the verb. The JE model (SO * OV), which we presented as an alternative to the ECU model to better account for the typicality effects at the verb, yielded effects of both type and typicality at the verb, but did not account for their interaction.

Our most surprising result is that the OV, SOV*, and SO*OV models explain the effect of type. As DSMs do not represent this concept explicitly, a possible interpretation suggested by our results is that type and typicality are not distinct categories, but capture properties of predicate-argument combinations at different granularity levels.

Distributional models can account for types because they emerge from the observed corpus distributions. Specifically, for the aspectual verbs used in the present data set, the distribution over their objects (namely, that event nouns occur much more frequently than object nouns; Zarcone et al., 2013) corresponds more naturally to an interpretation in terms of types than of typicality. A compositional distributional model where semantic types emerge as patterns of behavior has the advantage of relying on minimal assumptions regarding the granularity of the type ontology, which is intriguing, as pattern recognition is a key aspect of human cognition (Rumelhart and McClelland, 1987; Saffran et al., 1996; Tomasello, 2009).

In conclusion, the picture that emerges from our experiments is one where (1) expectations for predicate-argument combinations have a hierarchical structure, with types as a high-level distinction and typicality as a low-level distinction, (2) both levels are different, but interact early during processing, influencing reading times, and (3) both type and typicality can emerge from the "same same" distributional model.

Acknowledgments

This research was funded by the German Research Foundation (DFG) as part of SFB 732 "Incremental Specification in Context" and SFB 1102 "Information Density and Linguistic Encoding".

References

Marco Baroni and Alessandro Lenci. 2010. Distributional Memory: a general framework for corpus-based semantics. Computational Linguistics, 36(4):673–721.

Klinton Bicknell, Jeffrey L Elman, Mary Hare, Ken McRae, and Marta Kutas. 2010. Effects of event knowledge in processing verbal arguments. Journal of Memory and Language, 63:489–505.

Stefan Evert. 2005. The statistics of word cooccurrences. Ph.D. thesis, Universität Stuttgart.

Argyro Katsika, David Braze, Ashwini Deo, and Maria Mercedes Piñango. 2012. Complement coercion: Distinguishing between type-shifting and pragmatic inferencing. The Mental Lexicon, 7(1):58–76.

Alessandro Lenci. 2011. Composing and updating verb argument expectations: A distributional semantic model. In Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics, pages 58–66, Portland, OR.

Kazunaga Matsuki, Tracy Chow, Mary Hare, Jeffrey L Elman, Christoph Scheepers, and Ken McRae. 2011. Event-based plausibility immediately influences on-line language comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37(4):913–934.

Liina Pylkkänen and Brian McElree. 2006. The syntax-semantics interface: On-line composition of sentence meaning. In M. Traxler and M. A. Gernsbacher, editors, Handbook of Psycholinguistics, pages 539–579. Elsevier, Amsterdam, The Netherlands, 2nd edition.

David E. Rumelhart and James L. McClelland. 1987. Learning the past tenses of English verbs. Implicit rules or parallel distributed processing. In Mechanisms of language acquisition, pages 249–308. Lawrence Erlbaum Associates, Hillsdale, NJ.

Jenny R Saffran, Richard N Aslin, and Elissa L Newport. 1996. Statistical learning by 8-month-old infants. Science, 274(5294):1926–1928.

Michael Tomasello. 2009. Constructing a language: A usage-based theory of language acquisition. Harvard University Press, Cambridge, MA.

Alessandra Zarcone, Jason Utt, and Sebastian Padó. 2012. Modeling covert event retrieval in logical metonymy: probabilistic and distributional accounts. In Proceedings of the 3rd Workshop on Cognitive Modeling and Computational Linguistics, pages 70–79, Montréal, Canada.

Alessandra Zarcone, Alessandro Lenci, Sebastian Padó, and Jason Utt. 2013. Fitting, not clashing! a distributional semantic model of logical metonymy. In Proceedings of the 10th International Conference on Computational Semantics, Potsdam, Germany.

Alessandra Zarcone, Alessandro Lenci, Ken McRae, and Sebastian Padó. in preparation. Type and thematic fit in logical metonymy resolution.
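Illustrative sketch. The ECU procedure of Section 3.1 (verb expectations from subject and object, component-wise combination, centroid of the most expected verbs, cosine to the centroid) and the JE combination of SO and OV scores can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used for the reported results: the tuple weights and verb vectors are toy values rather than TypeDM data, the centroid is assumed to be an unweighted mean, and all names (WEIGHTS, VECTORS, expectation, combine, centroid, ecu_sov_fit, joint_expectation) are hypothetical.

```python
import math

# Hypothetical LMI-style weights for <word relation verb> tuples.
# Toy values for illustration only, not taken from TypeDM.
WEIGHTS = {
    ("child", "subj", "unwrap"): 4.0,
    ("child", "subj", "eat"): 3.0,
    ("child", "subj", "begin"): 1.0,
    ("present", "obj", "unwrap"): 5.0,
    ("present", "obj", "begin"): 1.0,
    ("present", "obj", "eat"): 0.0,
}

# Hypothetical low-dimensional verb vectors (toy values).
VECTORS = {
    "unwrap": [1.0, 0.0, 0.0],
    "eat": [0.0, 1.0, 0.0],
    "begin": [0.5, 0.5, 1.0],
}


def expectation(word, relation):
    """Distribution over verbs defined by the weights of <word relation verb>."""
    return {v: w for (x, r, v), w in WEIGHTS.items() if x == word and r == relation}


def combine(e1, e2, op):
    """Combine two expectations component by component (sum or product)."""
    verbs = set(e1) | set(e2)
    if op == "sum":
        return {v: e1.get(v, 0.0) + e2.get(v, 0.0) for v in verbs}
    if op == "product":
        return {v: e1.get(v, 0.0) * e2.get(v, 0.0) for v in verbs}
    raise ValueError("op must be 'sum' or 'product'")


def centroid(expect, k=20):
    """Unweighted mean vector of the k most expected verbs (an assumption)."""
    top = sorted(expect, key=expect.get, reverse=True)[:k]
    if not top:
        raise ValueError("empty expectation")
    dims = len(VECTORS[top[0]])
    return [sum(VECTORS[v][i] for v in top) / len(top) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def ecu_sov_fit(verb, subj, obj, op="product", k=20):
    """SOV+ ('sum') or SOV* ('product') thematic fit of a verb given S and O."""
    combined = combine(expectation(subj, "subj"), expectation(obj, "obj"), op)
    return cosine(VECTORS[verb], centroid(combined, k))


def joint_expectation(so_score, ov_score, op="product"):
    """JE wrapper: combine SO and OV scores additively or multiplicatively."""
    if op == "sum":
        return so_score + ov_score
    if op == "product":
        return so_score * ov_score
    raise ValueError("op must be 'sum' or 'product'")


if __name__ == "__main__":
    # k=2 because the toy vocabulary has only three verbs.
    print(ecu_sov_fit("unwrap", "child", "present", op="product", k=2))
    print(ecu_sov_fit("eat", "child", "present", op="product", k=2))
    print(joint_expectation(0.5, 0.4, op="product"))
```

In this sketch the SO and OV scores passed to joint_expectation are plain numbers; in a fuller model they would themselves be thematic fit scores computed by ECU-style components, as described in the Joint Expectations paragraph.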
