Simplenlg-IT: Adapting Simplenlg to Italian

Total Page:16

File Type:pdf, Size:1020Kb

Simplenlg-IT: Adapting Simplenlg to Italian SimpleNLG-IT: adapting SimpleNLG to Italian Alessandro Mazzei and Cristina Battaglino and Cristina Bosco Dipartimento di Informatica Universita` degli Studi di Torino Corso Svizzera 185, 10149 Torino [mazzei,battagli,bosco]@di.unito.it Abstract guage specific features, like word order, inflection and selection of functional words. This paper describes the SimpleNLG-IT re- Surface realisers can be classified on the basis of aliser, i.e. the main features of the porting of the SimpleNLG API system (Gatt and Reiter, their input. Fully fledged realisers accept as input an 2009) to Italian. The paper gives some details unordered and uninflected proto-syntactic structure about the grammar and the lexicon employed enriched with semantic and pragmatic features that by the system and reports some results about are used to produce the most plausible output string. a first evaluation based on a dependency tree- OpenCCG is a member of this category of realisers bank for Italian. A comparison is developed (White, 2006). Indeed, OpenCCG accepts as input with the previous projects developed for this a semantic graph representing a set of hybrid logic task for English and French, which is based on formulas. The hybrid logic elements are indeed the morpho-syntactical differences and simi- larities between Italian and these languages. the semantic specification of syntactic CCG struc- tures defined in the grammar realiser. The semantic graph under-specifies morpho-syntactic information 1 Introduction and delegates to the realiser many lexical and syn- tactic choices (e.g. function words). Chart based Natural Language Generation (NLG) involves a algorithms and statistical models are used to resolve number of elementary tasks that can be addressed by the ambiguity arising from under-specification. using different approaches and architectures. A well defined standard architecture is the pipeline pro- In contrast, realisation engines are simpler sys- posed by Reiter and Dale (Reiter and Dale, 2000). tems which perform just linearisation and mor- In this approach, three steps transform raw data phological inflections of the proto-syntactic input. into natural language text, that are: document As a consequence, realization engine presumes a planning, sentence-planning and surface realization. more detailed morpho-syntactic information as in- Each one of these modules triggers the next one ad- put. A member of this category is SimpleNLG dressing a distinct issue as follows. In document (Gatt and Reiter, 2009). It assumes a complete planning the user decides the information content of syntactic specification, but unordered and unin- the text to be generated (what to say). In sentence- flected, of the sentence in the form of a mixed con- planning, the focus is instead on the design of a stituency/dependency structure. Content and func- number of features that are related to the informa- tion words are chosen in input as well as modifiers tion contents as well as to the specific language, as order. The greatest advantage of this system is its the choice of the words. Finally, in surface realisa- simplicity, which allows to pay more efforts in the tion, sentences are generated according to the deci- previous stages of the NLG pipeline. sions taken in the previous stages and by fulfilling SimpleNLG was originally designed for English the morpho-syntactic constraints related to the lan- but it has been successively adapted to German, 184 Proceedings of The 9th International Natural Language Generation conference, pages 184–192, Edinburgh, UK, September 5-8 2016. c 2016 Association for Computational Linguistics French, Brazilian-Portuguese and Telugu (Boll- describe the grammar defined by the system, that mann, 2011; Vaudry and Lapalme, 2013; de Oliveira has been developed starting from the SimpleNLG- and Sripada, 2014; Dokkara et al., 2015). The first EnFr1.1. grammar; in Section 3 we describe the contribution of this paper is the adaptation of Sim- lexicon adopted, that has been built starting from pleNLG for Italian1. The most challenging issues three lexical resources available for Italian; Sec- under this respect of this project (see Sections 2 and tion 4 describes the evaluation of the system, which 3) are: (1) the Italian verb conjugation system, that is based on examples from both grammar books (Pa- cannot be easily mapped to the English system and tota, 2006) and an Italian treebank (Nivre et al., shows many idiosyncrasies; (2) the high complex- 2016); finally, Section 5 closes the paper with some ity of the Italian morphological inflections; (3) the final considerations and pointing to future works. lack of a publicly available computational lexicon suitable for generation. Nevertheless, the contribu- 2 From French to Italian grammar tion of this paper goes beyond the adaptation of the existing implementation to a novel language. We In this Section we will focus on the generation of applied indeed a treebank-based methodology (see constituents and on their order within the sentence the monolingual and multilingual resources cited be- in Italian. In this achievement, the main reference low) for both evaluating our results (see sec. 4), and for Italian grammar is (Patota, 2006). In general, describing in a comparative perspective the features it must be observed that Italian, like French, is fea- of the implemented grammar, referring to the dif- tured by a rich inflection that is clearly attested by ferences between Italian, French and English. This Verbs, but also by the behavior of other grammati- makes the work more linguistically sound and data- cal categories whose morphosyntactic features (e.g. driven. We started our work from SimpleNLG- gender, number and case) are crucial for determin- EnFr1.1, that is an adaptation to French (Vaudry and ing their syntactical order in the phrases to be gener- Lapalme, 2013) of the model developed for English ated. As stated above, our approach is based on that in (Gatt and Reiter, 2009). A property of our project adopted for French in (Vaudry and Lapalme, 2013), is multilingualism: by using the same architecture which has been in turn inspired by that used for En- of SimpleNLG-EnFr1.1 we are able to multilingual glish (Gatt and Reiter, 2009). First of all, we devel- documents with sentences in English, French and oped therefore a comparison among Italian and these Italian. other two languages in order to detect the main novel features to be taken into account in the development In porting SimpleNLG-EnFr1.1 to SimpleNLG- 2 IT, we created 10 new packages and modified 28 of SimpleNLG-IT. The parallel treebank ParTUT existing classes. The morphology and morphonol- developed for Italian/French/English helped us in ogy processors needed to be written from scratch this comparison. because of the features that differentiate Italian with In the rest of this Section we organize these fea- respect to French and English. The Syntax proces- tures in the main classes which did drive the pro- sor needed to be adapted, especially for the man- cesses we implemented: morphology and syntax, agement of noun and verb phrases and for clauses. which are strictly interrelated because of the concor- However, at this stage, we used the same orthog- dance phenomena, and morphonology. raphy processor of French. We needed to extend the system with 33 new lexical features, necessary 2.1 Morphology and syntax for accounting verb irregularities (subjunctive, con- 2.1.1 Verb conjugation ditional, remote past, etc.) and for processing the Italian is featured by a complexity of inflection superlative irregular form of the adjectives. which is typical of morphologically rich languages In the next Sections we survey the main features and its richness, in this perspective, positively com- of SimpleNLG-IT, in particular: in Section 2 we pares with that of French. Nevertheless, in order to 1SimpleNLG-IT: https://github.com/ 2http://www.di.unito.it/˜tutreeb/ alexmazzei/SimpleNLG-IT treebanks.html 185 develop a suitable model for Italian verbs, we dif- Italian conjugation Tense PE PR ferentiate the implementation of SimpleNLG-IT un- indicativo presente present F F imperfetto past F F der this respect with that exploited in SimpleNLG- futuro semplice future F F FrEn1.1. futuro anteriore future T F The main traits that we have assumed in this phase passato prossimo past T F of the project for modeling verbs are tense, progres- passato remoto remote-past T F sive and perfect, as can it be seen in the Table 1. The trapassato prossimo plus-past T F trapassato remoto plus-remote-past T F opposition between the different features, i.e. per- passato remoto remote-past T F fect and imperfect, can be expressed by using dif- presente progressivo present F T ferent means in different languages. While in En- passato progressivo past F T glish aspect is especially relevant and strictly inter- futuro progressivo future F T related with mood and tense, in Italian and other Ro- condizionale presente present F F condizionale passato past F F mance languages derived from Latin several means congiuntivo presente present F F are available for expressing it, which vary from in- congiuntivo imperfetto past F F flection, to lexical selection, to syntactic choice of congiuntivo passato past T F periphrastic forms and a system of moods richer congiuntivo trapassato plus-past T F than that of English. On the one hand, the imper- Table 1: Relation between verb tenses and traits in Italian: fect forms for present (I’m writing) and past (I was TENSE is a multi-value feature; PErfect and PRogressive are writing) and the perfect forms for present (I have two boolean features. written) and past (I had written) exploited in English cannot always find a unique correspondence in Ital- ian forms. On the other hand, while the progressive form Io sto scrivendo surely corresponds to I’m writ- ifiers, while English nouns often occur without de- ing, the form Io scrivo can be translated with I write terminers.
Recommended publications
  • The Onset of Bilingualism
    The Onset of Bilingualism: Novel Evidence from Root Infinitives, Parameter Setting and Participial Constructions in support of the Separate Systems Hypothesis Manola Salustri Universita’ di Siena Dottorato in Scienze Cognitive a.a. 2002-2003 i Contents Introduction p.1 Chapter I Setting of Parameters in p.5 Bilingual German-Italian 1.1 Introduction Children 1.2 Method 1.3 Analysis of the data 1.4 Conclusion Chapter II p.15 Root Infinitives in bilingual 2.1 Introduction German-Italian Acquisition 2.2 RI in adult grammar 2.3 The interpretation of RI in child grammar 2.4 Differences and similarities between adult and child interpretation 2.5 RI in early grammar: different explanations 2.6 RI in bilingual German Italian Acquisition 2.7 Conclusion ii References Chapter III Is there an analogue to the RI stage in the Null Subject p.45 Languages? 3.1 Introduction 3.2 How Italian speaking children express modality and irrealis meaning? 3.3 Imperatives in early Italian: a developmental phenomenon? 3.4 The Imperative analogue hypothesis (IAH) 3.5 The structure of irrealis in child language Chapter IV Participial Construction in p.84 Child German and Italian 4.1 Introduction 4.2 Participial Constructions in German and Italian 4.3 Frequency of Participial Constructions 4.4 Types of Participial Constructions 4.5 Analysis of the Asymmetries: TDUR 4.6 Extending TDUR to other complex verb forms 4.7 Analysis of Frequency asymmetries 4.8 Conclusion p.111 p.121 Appendix p.124 Conclusion iii Aknowledgment This thesis couldn’t have been done without the help of a number of people.
    [Show full text]
  • Italian Verb Conjugation Table
    Italian Verb Conjugation Table Abbevillian Georgia sometimes investigated his brushwork infallibly and intervolves so outdoors! Dennie reconvicts sociologically as kacha Prentice neologizes her wagonages befouls stateside. Hypaethral and enthralling Tuckie still attempt his anodynes solidly. Tests and efficient way to have proven this script used as supplementary materials and writing the! For italian verb tables, etc to the table that objects vocabulary then. Spanish lesson is the problems a pleasure in modern hebrew discourse in italian verb conjugation table for beginners with a crisis: verbs of conjugated with common italian learners of rules. Important verbs conjugated into the table shows how to tell you! Spanish verbs in certain moment of regular and functional use signatures spoken in italian verb conjugation table, and style and how else, but enough about how to. Language skills for students take if html does it helps you may be categorized as an italian conjugation nation french. Used italian conjugation table showing their conjugated in fact inches street, go to indicate the time form of the formation of variations on. What concerns the list of the english, the last tuesday during my english! This is especially when reading and making learning italian, all levels to spanish courses presuppose fluency in italian is. You the conjugation driving, the conjugator will entertain a native speakers. All italian you can be polite plural welcome to tables and exercises can be found on the table must for example. To tables of which they doing something that had already know to english composition, i have a table provides a vocabulary phrases! They can use verbs conjugated with explanatory texts in table that was a child is a new podcast from this quiz you can.
    [Show full text]
  • Stem Spaces and Predictability in Verbal Inflection
    STEM SPACES AND PREDICTABILITY IN VERBAL INFLECTION FABIO MONTERMINI, OLIVIER BONAMI ABSTRACT: This paper outlines and justifies a general analysis of Romance verbal paradigms. In most Romance languages, verbs present a complex inflectional sys- tem, where lexemes are divided into several classes and subclasses, inflectional endings are highly redundant across classes for the same cell, and stems present a high degree of internal variation. These facts make an analysis based on a unique invariant underlying form difficult, as it requires the elaboration of a complex sys- tem of rules for deriving surface forms. By contrast, the approach described here aims at reaching economy of description not by minimising the amount of stored information, but by proposing strategies for organising the complex phonological representations attached to lexical units. The tools of this organisation are stems, i.e. phonological subcomponents of sets of inflected forms in systematic co-variation; a stem space, specifying the distribution of stems throughout paradigms; and a stem graph, which encapsulates the predictability relations in the stem graph through a minimal set of connections that are maximally reliable. KEYWORDS: Inflectional morphology, allomorphy, stems, paradigms, Romance languages. 1. INTRODUCTION* One main challenge of present-day studies on morphology is the elaboration of theoretical models that are compatible with current knowledge on morphological processing and the mental lexicon.1 This paper aims at contributing to the construction of such models by presenting an overview of * We are grateful to Vito Pirrelli for having given the first impulsion to this paper; we are also grateful to the participants to the workshop Understanding the Nature of the Mental Lexicon, Pisa, 24-26 November 2011, where we presented a preliminary version of this paper, for their valuables comments; finally, we wish to thank Gilles Boyé and Basilio Calderone for many discussions of the issues presented here.
    [Show full text]
  • The Impact of My Brilliant Friend on Twitter: a Catalyst for A… Brilliant Digital Affiliation?
    The Impact of My Brilliant Friend on Twitter: a Catalyst for a… Brilliant Digital Affiliation? Nikita Lobanov, Anna Zingaro – Dipartimento di Interpretazione e Traduzione, Università di Bologna, Campus di Forlì Citation: Lobanov, Nikita, Anna Zingaro (2020) “The Impact of My Brilliant Friend on Twitter: a Catalyst for a… Brilliant Digital Affiliation?”, in Adriano Ferraresi, Roberta Pederzoli, Sofia Cavalcanti, Randy Scansani (eds.) Metodi e ambiti nella ricerca sulla traduzione, l’interpretazione e l’interculturalità – Research Methods and Themes in Translation, Interpreting and Intercultural Studies, MediAzioni 29: B134-B154, http://www.mediazioni.sitlec.unibo.it, ISSN 1974-4382. 1. Introduction A new wave of high-quality TV productions in Europe raises questions regarding how fans around the world react to culture-specific peculiarities and quirks of environments and situations that have emerged from a non-English language TV production. This chapter will examine the case of My Brilliant Friend (henceforth MBF), a successful TV-series in Italian, based on the first of the four-book saga Neapolitan Novels by Elena Ferrante. The saga follows the lives of two girls who grew up in a rough neighbourhood on the outskirts of Naples from the 1950s to the present day. It is a global bestseller translated into English, Dutch, French, German Spanish and numerous other languages (Muldoon, 2017). The first book — MBF — was adapted for TV through a co- production between the US cable network HBO and Italian networks RAI Fiction and TIM Vision, that aired the first episode on November 18th and 27th 2018 respectively. This series attracted considerable attention from fans affected with so-called “Ferrante Fever” (see Straniero and Amadio, 2017) that evolved in the B134 phenomenon of fandom, i.e.
    [Show full text]
  • Italianpod101.Com Learn Italian with FREE Podcasts
    1 ItalianPod101.com Learn Italian with FREE Podcasts Introduction to Italian Lesson 1-25 1-25 2 ItalianPod101.com Learn Italian with FREE Podcasts Introduction This is Innovative Language Learning. Go to InnovativeLanguage.com/audiobooks to get the lesson notes for this course, and sign up for your FREE lifetime account. This Audiobook will take you through the basics of Italian with Basic Bootcamp, All About, and Pronunciation lessons. The five Basic Bootcamp lessons each center on a practical, real-life conversation. At the beginning of the lesson, we'll introduce the background of the conversation. Then, you'll hear the conversation two times: one time at natural native speed, and one time with English translation. After the conversation, you'll learn carefully selected vocabulary and key grammar concepts. Next, you'll hear the conversation one time at natural native speed. Finally, practice what you have learned with the review track. In the review track, a native speaker will say a word or phrase from the dialogue, wait three seconds, and then give you the English translation. Say the word aloud during the pause. Halfway through the review track, the order will be reversed. The English translation will be provided first, followed by a three-second pause, and then the word or phrase from the dialogue. Repeat the words and phrases you hear in the review track aloud to practice pronunciation and reinforce what you have learned. In the fifteen All About lessons, you’ll learn all about Italian and Italy. Our native teachers and language experts will explain everything you need to know to get started in Italian, including how to understand the writing system, grammar, pronunciation, cultural background, tradition, society, and more -- all in a fun and educational format! The five Pronunciation lessons take you step-by-step through the most basic skill in any language: how to pronounce words and sentences like a native speaker.
    [Show full text]
  • The Development of Morpho-Syntactic Competence in Italian- Speaking Children: a Usage-Based Approach
    Citation: Miorelli, Luca (2017) The Development of Morpho-Syntactic Competence in Italian- Speaking Children: a Usage-Based Approach. Doctoral thesis, Northumbria University. This version was downloaded from Northumbria Research Link: http://nrl.northumbria.ac.uk/39990/ Northumbria University has developed Northumbria Research Link (NRL) to enable users to access the University’s research output. Copyright © and moral rights for items on NRL are retained by the individual author(s) and/or other copyright owners. Single copies of full items can be reproduced, displayed or performed, and given to third parties in any format or medium for personal research or study, educational, or not-for-profit purposes without prior permission or charge, provided the authors, title and full bibliographic details are given, as well as a hyperlink and/or URL to the original metadata page. The content must not be changed in any way. Full items must not be sold commercially in any format or medium without formal permission of the copyright holder. The full policy is available online: http://nrl.northumbria.ac.uk/policies.html 2017 Northumbria University Luca Miorelli THE DEVELOPMENT OF MORPHO- SYNTACTIC COMPETENCE IN ITALIAN-SPEAKING CHILDREN: A USAGE-BASED APPROACH Volume I of II Supervisors: Prof. Ewa Dąbrowska Dr James Street The Development of Morpho- Syntactic Competence in Italian-Speaking Children: A Usage-Based Approach Luca Miorelli A thesis submitted in partial fulfilment of the requirements of the University of Northumbria at Newcastle upon Tyne for the degree of Doctor of Philosophy Research undertaken in the Faculty of Arts, Design & Social Sciences October 2017 ABSTRACT Usage-Based scholars (e.g.
    [Show full text]
  • Interlingua Dictionary Paul Denisowski
    # IEDICT - 19 December 2013 - Paul Denisowski ([email protected]) a : to, at, for a alcun momento futur : at some point in the future a alte voce : at the top of one's voice a altere tempores : at other times a basso : down, downward a basso le traitores! : down with the traitors! a bon mercato at : a low rate, cheap(ly) a bordo : on board (of), aboard a bordo de : on board (of), aboard a bucca aperte : with an open mouth a cappella : a cappella a cata passo : with each step a causa de : because of, owning to a causa del calor : Because of the heat a cavallo : on horseback a celo aperte : under the open sky, in the open (air) a condition que : on condition that, provided that a conto : on account a conto de : to the account of a contratempore : at the wrong moment a costo de : at the cost of a despecto de : despite a dicer le veritate : to tell the truth a discretion : at discretion a distantia : at a distance a domo : home, homewards, at home a Douglas : in/at Douglas a duo talias : double-edged a extension limitate : to a limited extent a fin de : so that, in order that a fin que : so that, in order that a flor de : flush with, on a level with a fortia de : by means of, by dint of a fundo : deeply, profoundly, in great detail, thoroughly a grado : at will a hic : to here a ille : to him a ille tempore : at that time a infra : downwards a iste hora ipse : at this very hour a iste momento : at that moment a isto : at this, upon this a judicar per : judging by à la : à la à la carte : à la carte à la viennese : in the Viennese manner
    [Show full text]
  • Italian-Verb-Conjugation-Table.Pdf
    Italian Verb Conjugation Table Rufescent and moldy Thayne outdistancing while barrel-vaulted Marius shatter her subbase impressively and supples elementally. Pryce estivates discretionarily if upstairs Antoni disentitles or conglomerate. Overrash Chancey repopulates, his blepharospasm intussuscept lush expectantly. Italian regular verb stems that italian conjugation nation french, listen to other articles, i moved more about the verb to describe events French Phrases Native Speakers Never Use babbel. All downloads are in PDF Format and consist of a worksheet and answer sheet to check your results. German is a Germanic language, whose predecessor called! Revise and improve your Spanish with personalised learning tools for exam and test preparation. For example, we drove and we drive. Invites to free speaking workshops. The File Manager will open in a new tab or window. Need Hebrew verb conjugations offline? In this post we focus on this particular aspect. Lapuerta, Central Connecticut State University. These are fairly simple. In modern Hebrew the obviously related word Shelem means to pay for, and Shulam means to be fully paid. Express Yourself in Better Ways with Brand New Sentence Checker The free online grammar checker is the dream of any student and professional writer. Italien verbs Verben allein der! Despite these differences, Japanese learners of English rarely have particular difficulties with English writing. The best way to do this is to practise! You explain it well and I have tried to the exercises. Test your Spanish level with these free Spanish quizzes. Fill in the blank with the singular or plural form of Piacere in the present tense. Bible lessons on Spanish.
    [Show full text]