Linguistic Structure in Statistical Machine Translation

Dissertation approved by the Department of Informatics (Fakultät für Informatik) of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doctor of Engineering (Doktor der Ingenieurwissenschaften)

by Teresa Herrmann, from Dresden

Date of the oral examination: 19 June 2015
First reviewer: Prof. Dr. Alexander Waibel
Second reviewer: Dr. Stephan Vogel

Abstract

The first approaches to machine translation required linguists or language experts to write language-dependent translation rules by hand. In the last 20 years, statistical approaches applying machine learning techniques have become the state of the art. They automatically learn translation rules from large parallel corpora of existing translations. This facilitates fast development of translation systems for new languages without the need for language experts writing rules. In phrase-based machine translation, new translations are constructed from phrases seen in previous translations during training. In most cases, this flat composition of phrases leads to good results, especially for translations between languages with similar sentence structure. However, this strong suit of statistical translation becomes a weakness when there are structural differences between the languages. German is one of the languages where structural changes are necessary during translation, which makes translation more difficult. A particular challenge is the position of the verb in the German sentence, which depends on various factors that need to be taken into consideration when translating from or into German. Knowledge about sentence structure could provide helpful information for the translation process. This thesis investigates the influence of linguistic structure in statistical machine translation.
The first part of this thesis addresses the particular challenge that differences in word order between languages present for translation. We develop a word reordering model based on the structural information inherent in phrase structure trees. This reordering model is included as a separate component in a phrase-based machine translation system. It consists of rules on how to change the word order when translating from one language into another. These rules are automatically learned from a parallel corpus that was annotated with syntactic parse trees for the source language. They define the search space of possible reorderings of the source sentence during translation. Compared with a part-of-speech-based reordering model alone, the combination of the two models improves the word order in the target language, and the improvement increases further when a lexicalized reordering model is included. Hence, the different word reordering models, which operate on different linguistic levels (words, parts of speech and syntax), have complementary effects that grow when the models are combined.

Measuring translation quality with regard to word order is difficult with automatic evaluation metrics. We perform a manual evaluation of the reordering approach on three different data sets representing different genres. Although the number of affected sentences varies between genres, the manual evaluation of the translation quality on German-English translation confirms consistent improvements on all three data sets. The improvements introduced by the syntactic reordering model consist of translations of words that were previously dropped from the translation, as well as improved positions of words and whole constituents in the translated sentence. As intended in the design of the syntactic reordering model, verbs are the most affected word category.
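As a rough illustration of how tree-based reordering rules can define a search space of source-sentence reorderings, the following Python sketch enumerates every word order reachable by optionally applying a rule at each node of a phrase structure tree. The tree encoding, the rule format (parent label and child labels mapped to permutations), the two example rules and the function name `reorderings` are all assumptions made for this illustration; they are not the rule format or learning procedure of the thesis.

```python
from itertools import product

# Hypothetical encoding: an inner node is (label, [children]),
# a leaf is (pos_tag, "word").
def is_leaf(node):
    return isinstance(node[1], str)

def reorderings(node, rules):
    """Return every word order reachable by optionally applying a rule at each node."""
    if is_leaf(node):
        return {(node[1],)}
    label, children = node
    child_labels = tuple(child[0] for child in children)
    # Keeping the original child order is always allowed (monotone path).
    orders = [tuple(range(len(children)))]
    orders += rules.get((label, child_labels), [])
    child_options = [sorted(reorderings(child, rules)) for child in children]
    results = set()
    for order in orders:
        # Combine one reordering of each child, in the order given by the rule.
        for choice in product(*(child_options[i] for i in order)):
            results.add(tuple(word for part in choice for word in part))
    return results

# German subordinate clause "dass er das Buch gelesen hat" (verb-final).
tree = ("S", [
    ("KOUS", "dass"),
    ("PPER", "er"),
    ("VP", [
        ("NP", [("ART", "das"), ("NN", "Buch")]),
        ("VVPP", "gelesen"),
    ]),
    ("VAFIN", "hat"),
])

# Hand-written example rules; in practice such rules would be learned from data.
rules = {
    ("S", ("KOUS", "PPER", "VP", "VAFIN")): [(0, 1, 3, 2)],  # move finite verb forward
    ("VP", ("NP", "VVPP")): [(1, 0)],                        # participle before object
}

options = reorderings(tree, rules)
for option in sorted(options):
    print(" ".join(option))
```

The resulting set contains both the original order and the English-like order "dass er hat gelesen das Buch"; a decoder would then choose among such options during translation.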
Furthermore, we analyzed the potential and limits of the syntax-based reordering approach with experiments regarding the performance of the syntax-based and the part-of-speech-based reordering approaches. Oracle experiments revealed the maximal improvement that can be achieved with the syntax-based reordering model. First, the upper bound was determined for the source sentence reordering approach in general. This was compared with the highest achievable performance of the syntax-based approach and its actual performance generated by the decoder. It can be shown that the syntax-based reordering rules improve the search space of available reordering possibilities. More correct reordering options are generated, and when translating from German to English those are also chosen frequently for translation. For translation from English into German, there is potential for improving the search among the reordering options to generate better translations. The experiments also indicate that the reordering model would benefit from additional rules to expand the search space further in order to approximate the word order even better.

In the second part of the thesis we address the issues of translating pronouns and improving the morphological agreement for morphologically rich languages in statistical machine translation. For disambiguating between all possible translation options for a word, context information is needed. Especially when morphological agreement between words, such as pronouns and their antecedents or subjects and verbs, is required, the dependencies between words are also important. A source discriminative word lexicon (SDWL) is developed for predicting the translation for individual source words in their respective contexts. The prediction is performed as a classification task, where the context words and dependency relations of the given source word are used as structural features of the source sentence to guide the translation prediction.
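To make the classification setup more concrete, here is a minimal Python sketch of predicting the translation of one source word from its context words and dependency relations. The feature names, the window size, the toy training data and the use of a simple multiclass perceptron are assumptions chosen for illustration; the abstract does not state which classifier or feature templates the SDWL uses.

```python
from collections import defaultdict

def extract_features(tokens, heads, deprels, i, window=2):
    """Features for source word i: nearby words plus its dependency relations."""
    feats = {"bias"}
    for offset in range(-window, window + 1):
        j = i + offset
        if offset != 0 and 0 <= j < len(tokens):
            feats.add(f"ctx[{offset}]={tokens[j]}")
    # heads[i] is the index of the head of token i, or -1 for the root.
    if heads[i] >= 0:
        feats.add(f"head={tokens[heads[i]]}")
    feats.add(f"deprel={deprels[i]}")
    # Dependents of word i, e.g. the subject of a verb.
    for j, head in enumerate(heads):
        if head == i:
            feats.add(f"child:{deprels[j]}={tokens[j]}")
    return feats

class Perceptron:
    """Illustrative multiclass perceptron over sets of binary features."""

    def __init__(self):
        self.weights = defaultdict(float)
        self.labels = []  # list keeps tie-breaking deterministic

    def score(self, feats, label):
        return sum(self.weights[(f, label)] for f in feats)

    def predict(self, feats):
        return max(self.labels, key=lambda label: self.score(feats, label))

    def train(self, examples, epochs=5):
        for _, gold in examples:
            if gold not in self.labels:
                self.labels.append(gold)
        for _ in range(epochs):
            for feats, gold in examples:
                guess = self.predict(feats)
                if guess != gold:
                    # Standard perceptron update: reward gold, penalize guess.
                    for f in feats:
                        self.weights[(f, gold)] += 1.0
                        self.weights[(f, guess)] -= 1.0

# Toy data: the German translation of English "are" depends on its subject.
heads = [1, -1, 1]
train = [
    (extract_features(["you", "are", "late"], heads, ["nsubj", "root", "xcomp"], 1), "bist"),
    (extract_features(["they", "are", "late"], heads, ["nsubj", "root", "xcomp"], 1), "sind"),
]
model = Perceptron()
model.train(train)

test_feats = extract_features(["they", "are", "here"], heads, ["nsubj", "root", "advmod"], 1)
prediction = model.predict(test_feats)
print(prediction)
```

In this toy setting the subject reached through the dependency relation is what separates the two verb forms, which mirrors the motivation for using dependency relations as features when agreement matters.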
The evaluation of the prediction accuracy shows improved translation prediction by the SDWL over a baseline classifier, especially for pronouns, subjects and verbs. The translation predictions for individual words are combined on the sentence level and used in N-best list re-ranking. According to automatic evaluation, the translation quality after re-ranking is improved.

Summary (Zusammenfassung)

The original approaches to machine translation required linguists or language experts to write language-specific translation rules by hand. Over the last 20 years, statistical approaches to machine translation have become the state of the art. They use machine learning methods to learn translation rules automatically from large corpora of existing translations. This makes it possible to develop translation systems for new languages quickly, without the need for expert knowledge. Composing a new translation from multi-word phrases, that is, from translation blocks seen during training, leads to good translations in most cases, especially when translating between languages whose sentence structure is similar. When the languages differ in structure, however, this becomes a weakness. German is one of the languages that require structural changes during translation, which makes the translation process harder. A particular difficulty in German is the position of the verb in the sentence, which depends on various factors that must be taken into account when translating into or from German. Knowledge of the linguistic structure of the sentence can, however, provide helpful information for translation. This thesis investigates the influence of linguistic structure in statistical machine translation.
Differences in word order between languages pose a particular difficulty for translation. In the first part of the thesis, a component for modeling word reordering is developed that is based on linguistic structural information from phrase structure trees. The rules for changing the word order are learned automatically from a parallel corpus whose source-language side has been annotated with syntactic phrase structure trees. The learned rules define the search space of possible reorderings of the source sentence for possible translations. Compared with a reordering model based only on parts of speech, combining the two models improves the word order, and adding a lexicalized reordering model increases the improvement further. This shows that the models used for modeling word order in translation have complementary effects, which can be combined to increase their effectiveness further.

Translation quality with respect to word order is hard to assess with automatic evaluation metrics. The thesis therefore carries out a manual evaluation of the reordering approach on three different text genres and styles. Although the number of affected and analyzed sentences varies, a manual sentence-level evaluation of translation quality shows consistent improvements on all three data sets. The improvements contributed by the syntax model consist of translations for words that were previously dropped from the translation, as well as better positions of individual words and whole constituents in the translated sentence. As intended in the design of the syntactic reordering model, verbs are the mainly affected ...