Distinguishing Past, On-Going, and Future Events: the Eventstatus Corpus

Total Page:16

File Type:pdf, Size:1020Kb

Distinguishing Past, On-Going, and Future Events: the Eventstatus Corpus Distinguishing Past, On-going, and Future Events: The EventStatus Corpus Ruihong Huang Ignacio Cases Dan Jurafsky Texas A&M University Stanford University Stanford University [email protected] [email protected] [email protected] Cleo Condoravdi Ellen Riloff Stanford University University of Utah [email protected] [email protected] Abstract as protests, demonstrations, marches, and strikes, in which each event is annotated as PAST, ON-GOING, Determining whether a major societal event or FUTURE (sublabeled as PLANNED, ALERT or has already happened, is still on-going, or may POSSIBLE). This task bridges event extraction re- occur in the future is crucial for event pre- search and temporal research in the tradition of diction, timeline generation, and news sum- TIMEBANK (Pustejovsky et al., 2003) and TempE- marization. We introduce a new task and a new corpus, EventStatus, which has 4500 En- val (Verhagen et al., 2007; Verhagen et al., 2010; glish and Spanish articles about civil unrest UzZaman et al., 2013). Previous corpora have be- events labeled as PAST, ON-GOING, or FU- gun this association: TIMEBANK, for example, in- TURE. We show that the temporal status of cludes temporal relations linking events with Doc- these events is difficult to classify because lo- ument Creation Times (DCT). But the EventStatus cal tense and aspect cues are often lacking, task and corpus offers several new research direc- time expressions are insufficient, and the lin- tions. guistic contexts have rich semantic composi- tionality. We explore two approaches for event First, major societal events are often discussed be- status classification: (1) a feature-based SVM fore they happen, or while they are still happening, classifier augmented with a novel induced lex- because they have the potential to impact a large icon of future-oriented verbs, such as “threat- number of people. News outlets frequently report ened” and “planned”, and (2) a convolutional on impending natural disasters (e.g., hurricanes), an- neural net. Both types of classifiers improve ticipated disease outbreaks (e.g., Zika virus), threats event status recognition over a state-of-the-art of terrorism, and plans or warnings of potential civil TempEval model, and our analysis offers lin- guistic insights into the semantic composition- unrest (e.g., strikes and protests). Traditional event ality challenges for this new task. extraction research has focused primarily on recog- nizing events that have already happened. Further- more, the linguistic contexts of on-going and future 1 Introduction events involve complex compositionality, and fea- tures like explicit time expressions are less useful. When a major societal event is mentioned in the Our results demonstrate that a state-of-the-art Tem- news (e.g., civil unrest, terrorism, natural disaster), it pEval system has difficulty identifying on-going and is important to understand whether the event has al- future events, mislabeling examples like these: ready happened (PAST), is currently happening (ON- (1) The metro workers’ strike in Bucharest has entered GOING), or may happen in the future (FUTURE). We the fifth day. (On-Going) introduce a new task and corpus for studying the (2) BBC unions demand more talks amid threat of new temporal/aspectual properties of major events. The strikes. (Future) EventStatus corpus consists of 4500 English and (3) Pro-reform groups have called for nationwide Spanish news articles about civil unrest events, such protests on polling day. (Future) Second, we intentionally created the EventSta- lexicon of 411 English and 348 Spanish “future- tus corpus to concentrate on one particular event oriented” matrix verbs—verbs like “threaten” and frame (class of events): civil unrest. In contrast, “fear” whose complement clause or nominal direct previous temporally annotated corpora focus on a object argument is likely to describe a future event. wide variety of events. Focusing on one frame (se- We show that the SVM outperforms a state-of-the- mantic depth instead of breadth) makes this corpus art TempEval system and that the induced lexicon analogous to domain-specific event extraction data further improves performance for both English and sets, and therefore appropriate for evaluating rich Spanish. We also introduce a Convolutional Neu- tasks like event extraction and temporal question an- ral Network (CNN) to detect the temporal status of swering, which require more knowledge about event events. Our analysis shows that it successfully mod- frames and schemata than might be represented in els semantic compositionality for some challenging large broad corpora like TIMEBANK (UzZaman et temporal contexts. The CNN model again improves al., 2012; Llorens et al., 2015). performance in both English and Spanish, providing Third, the EventStatus corpus focuses on specific strong initial results for this new task and corpus. instances of high-level events, in contrast to the low- level and often non-specific or generic events that 2 The EventStatus Corpus 1 dominate other temporal datasets. Mentions of spe- For major societal events, it can be very impor- cific events are much more likely to be realized in tant to know whether the event has ended or if it non-finite form (as nouns or infinitives, such as “the is still in progress (e.g., are people still rioting in strike” or “to protest”) than randomly selected event the streets?). And sometimes events are anticipated keywords. In breadth-based corpora like the Event- before they actually happen, such as labor strikes, CorefBank (ECB) corpus (Bejan and Harabagiu, marches and parades, social demonstrations, politi- 2008), 34% of the events have non-finite realization; cal events (e.g., debates and elections), and acts of in TIMEBANK, 45% of the events have non-finite war. The EventStatus corpus represents the tempo- realization. By contrast, in a frame-based corpus ral status of an event as one of five categories: like ACE2005 (ACE, 2005), 59% of the events have non-finite forms. In the EventStatus corpus, 80% of Past: An event that has started and has ended. There the events have non-finite forms. Whether this is due should be no reason to believe that it may still be in to differences in labeling or to intrinsic properties of progress. these events, the result is that they are much harder On-going: An event that has started and is still in to label because tense and aspect are less available progress or likely to resume2 in the immediate fu- than for events realized as finite verbs. ture. There should be no reason to believe that it has Fourth, the EventStatus data set is multilingual: ended. we collected data from both English and Spanish Future Planned: An event that has not yet started, texts, allowing us to compare events representing but a person or group has planned for or explicitly the same event frame across two languages that are committed to an instance of the event in the future. known to differ in their typological properties for de- There should be near certainty it will happen. scribing events (Talmy, 1985). Future Alert: An event that has not yet started, but Using the new EventStatus corpus, we investigate a person or group has been threatening, warning, or two approaches for recognizing the temporal status advocating for a future instance of the event. of events. We create a SVM classifier that incor- Future Possible: An event that has not yet started, porates features drawn from prior TempEval work but the context suggests that its occurrence is a live (Bethard, 2013; Chambers et al., 2014; Llorens et possibility (e.g., it is anticipated, feared, hinted at, al., 2010) as well as a new automatically induced or is mentioned conditionally). 1 For example in TIMEBANK almost half the annotated The three subtypes of future events are important events (3720 of 7935) are hypothetical or generic, i.e., PERCEP- TION, REPORTING, ASPECTUAL, I ACTION, STATE or I STATE 2For example, demonstrators have gone home for the day rather than the specific OCCURRENCE. but are expected to return in the morning. Past [EN] Today’s demonstration ended without violence. An estimated 2,000 people protested against the government in Peru. [SP] Termino´ la manifestacion´ de los kurdos en la UNESCO de Par´ıs. On-going [EN] Negotiations continue with no end in sight for the 2 week old strike. Yesterday’s rallies have caused police to fear more today. [SP] Pacifistas latinoamericanos no cesan sus protestas contra guerra en Irak. Future Planned [EN] 77 percent of German steelworkers voted to strike to raise their wages. Peace groups have already started organizing mass protests in Sydney. [SP] Miedo en la City en v´ıspera de masivas protestas que la toman por blanco. Future Alert [EN] Farmers have threatened to hold demonstrations on Monday. Nurses are warning they intend to walkout if conditions don’t improve. [SP] Indigenas hondurenos˜ amenazan con declararse en huelga de hambre. Future Possible [EN] Residents fear riots if the policeman who killed the boy is acquitted. The military is preparing for possible protests at the G8 summit. [SP] Polic´ıa Militar analiza la posibilidad de decretar una huelga nacional. Table 1: Examples of event status categories for civil unrest events, showing two examples in English [EN] and one in Spanish [SP]. in marking not just temporal status but also what we English words3 and 13 Spanish words4 and phrases might call predictive status. Events very likely to oc- associated with civil unrest events, and added their cur are distinguished from events whose occurrence morphological variants. We then randomly selected depends on other contingencies (Future Planned vs. 2954 and 14915 news stories from the English Gi- Alert/Possible). Warnings or mentions of a potential gaword 5th Ed. (Parker et al., 2011) and Spanish event by a likely actor are further distinguished from Gigaword 3rd Ed.
Recommended publications
  • The Structure of Plays
    n the previous chapters, you explored activities preparing you to inter- I pret and develop a role from a playwright’s script. You used imagina- tion, concentration, observation, sensory recall, and movement to become aware of your personal resources. You used vocal exercises to prepare your voice for creative vocal expression. Improvisation and characterization activities provided opportunities for you to explore simple character portrayal and plot development. All of these activities were preparatory techniques for acting. Now you are ready to bring a character from the written page to the stage. The Structure of Plays LESSON OBJECTIVES ◆ Understand the dramatic structure of a play. 1 ◆ Recognize several types of plays. ◆ Understand how a play is organized. Much of an actor’s time is spent working from materials written by playwrights. You have probably read plays in your language arts classes. Thus, you probably already know that a play is a story written in dia- s a class, play a short logue form to be acted out by actors before a live audience as if it were A game of charades. Use the titles of plays and musicals or real life. the names of famous actors. Other forms of literature, such as short stories and novels, are writ- ten in prose form and are not intended to be acted out. Poetry also dif- fers from plays in that poetry is arranged in lines and verses and is not written to be performed. ■■■■■■■■■■■■■■■■ These students are bringing literature to life in much the same way that Aristotle first described drama over 2,000 years ago.
    [Show full text]
  • Topic: Sentence Structure Definition: Syntax, Diction, and Grammar Create
    Topic: Sentence Structure Definition: Syntax, diction, and grammar create sentence style. They work together to create effective writing. Explanation: Good writing is clear and easily understood by readers. That means the following elements of writing must be used effectively: • Syntax (sentence structure) should be as concise as possible. • Diction (word choice) should be as precise as possible. • Grammar (spelling and punctuation) should be accurate. Examples of rules in style to remember • Use simple present tense consistently when describing someone's written statement. o Example: “In another article she is saying…” or “she said…” o Better: "In another article she says…" o Note on verb tense: Research papers are typically written in past tense, and papers about literature and film are written in present tense. • To maintain formal style, avoid contractions even though they are appropriate in conversation. o Example: "I've noticed that she's written a novel that many people haven't read." o Better: "I have noticed that she has written a novel that many people have not read." • To maintain formal style, avoid writing in second person (you, your). o Example: "When you read Lincoln's words, you will understand his wisdom." o Better: "Reading Lincoln's words reveals his wisdom." • Avoid qualifiers, such as, pretty, little, rather, somewhat, very, kind of. They sound uncertain and hesitant. Go easy on the use of very. o Example: "I am pretty certain that with a little effort, you can be somewhat sure of the kind of success that most people usually hope for." o Better: "With effort, she can reach the success others hope for." Page 1 of 3 • Go easy on superlatives.
    [Show full text]
  • Poetic Diction, Poetic Discourse and the Poetic Register
    proceedings of the British Academy, 93.21-93 Poetic Diction, Poetic Discourse and the Poetic Register R. G. G. COLEMAN Summary. A number of distinctive characteristics can be iden- tified in the language used by Latin poets. To start with the lexicon, most of the words commonly cited as instances of poetic diction - ensis; fessus, meare, de, -que. -que etc. - are demonstrably archaic, having been displaced in the prose register. Archaic too are certain grammatical forms found in poetry - e.g. auldi, gen. pl. superum, agier, conticuere - and syntactic constructions like the use of simple cases for pre- I.positional phrases and of infinitives instead of the clausal structures of classical prose. Poets in all languages exploit the linguistic resources of past as well as present, but this facility is especially prominent where, as in Latin, the genre traditions positively encouraged imitatio. Some of the syntactic character- istics are influenced wholly or partly by Greek, as are other ingredients of the poetic register. The classical quantitative metres, derived from Greek, dictated the rhythmic pattern of the Latin words. Greek loan words and especially proper names - Chaoniae, Corydon, Pyrrha, Tempe, Theseus, Zephym etc. -brought exotic tones to the aural texture, often enhanced by Greek case forms. They also brought an allusive richness to their contexts. However, the most impressive charac- teristics after the metre were not dependent on foreign intrusion: the creation of imagery, often as an essential feature of a poetic argument, and the tropes of semantic transfer - metaphor, metonymy, synecdoche - were frequently deployed through common words. In fact no words were too prosaic to appear in even the highest poetic contexts, always assuming their metricality.
    [Show full text]
  • ELEMENTS of FICTION – NARRATOR / NARRATIVE VOICE Fundamental Literary Terms That Indentify Components of Narratives “Fiction
    Dr. Hallett ELEMENTS OF FICTION – NARRATOR / NARRATIVE VOICE Fundamental Literary Terms that Indentify Components of Narratives “Fiction” is defined as any imaginative re-creation of life in prose narrative form. All fiction is a falsehood of sorts because it relates events that never actually happened to people (characters) who never existed, at least not in the manner portrayed in the stories. However, fiction writers aim at creating “legitimate untruths,” since they seek to demonstrate meaningful insights into the human condition. Therefore, fiction is “untrue” in the absolute sense, but true in the universal sense. Critical Thinking – analysis of any work of literature – requires a thorough investigation of the “who, where, when, what, why, etc.” of the work. Narrator / Narrative Voice Guiding Question: Who is telling the story? …What is the … Narrative Point of View is the perspective from which the events in the story are observed and recounted. To determine the point of view, identify who is telling the story, that is, the viewer through whose eyes the readers see the action (the narrator). Consider these aspects: A. Pronoun p-o-v: First (I, We)/Second (You)/Third Person narrator (He, She, It, They] B. Narrator’s degree of Omniscience [Full, Limited, Partial, None]* C. Narrator’s degree of Objectivity [Complete, None, Some (Editorial?), Ironic]* D. Narrator’s “Un/Reliability” * The Third Person (therefore, apparently Objective) Totally Omniscient (fly-on-the-wall) Narrator is the classic narrative point of view through which a disembodied narrative voice (not that of a participant in the events) knows everything (omniscient) recounts the events, introduces the characters, reports dialogue and thoughts, and all details.
    [Show full text]
  • Creative Writing - Essential Curriculum
    Creative Writing - Essential Curriculum CW11.10 – The student will demonstrate the ability to respond to a text by employing personal experience and critical analysis. • CW11.10.01 – Analyze how the patterns of organization, hierarchic structure, repetition of key ideas, syntax and word choice in a text influence understanding and reveal an author’s purpose. CW11.10.01a – Use scansion to analyze the meter of a poem. • CW11.10.02 – Analyze and evaluate how the literary elements of point of view, tone, imagery, voice, metaphor, and irony are used in texts for literary aesthetics. CW11.20 – The student will demonstrate the ability to compose in a variety of modes by developing content, employing specific forms, and selecting language appropriate for a particular audience. • CW11.20.01 – Write independently for an extended period of time. • CW11.20.02 – Produce literary writing and demonstrate an awareness of audience, purpose, and form using stages of the writing process as needed. • CW11.20.03 – Utilize effective structural elements of particular literary forms (e.g. poetry, short story, novel, drama, essay, biography, autobiography, journalistic writing, and film) to create works of literary merit. • CW11.20.04 – Use literary elements to compose poetry and fiction for literary aesthetics. CW11.20.04a – Use poetic elements to include consonance, assonance (and other forms of rhyme), meter (rhythm), simile, metaphor, hyperbole, understatement, personification, imagery, symbolism, syntax, parallelism, repetition, speaker’s voice, and irony. CW11.20.04b – Use poetic forms to include sonnet, haiku, ballad, blank verse, couplets, quatrains, sestets and at least 2 additional forms (ex – cinquains, villanelle, sestina, etc.).
    [Show full text]
  • Connotation, Diction, Figurative Language, Imagery, Irony, and Theme
    Tone: Connotation, Diction, Figurative Language, Imagery, Irony, and Theme What Should I Learn In This Lesson? ● I can explain the relationship of tone, connotation, and diction in a text. (RL/RI 4). ● I can choose an appropriate tone word for a passage and support it with individual diction choices. (Rl/RI4) ● I can explain the relationship between imagery and figurative language and describe its impact on the tone of a text. (Rl/RI4) ● I can explain why a situational or verbally ironic tone is present when its presence is identified for me. (RL/RI4) ● I can explain how an author uses diction choices and the tone they create to support the development of a theme category or a central idea. (RL/RI 2) What Is Tone? Tone is the general character, attitude, or feeling of a piece of art or literature. What tone does this painting have? How do you know? Tone in Our Bodies: Voice and Body Language Tone is also communicated through our voice and body language. Have you ever had an authority figure tell you to “Watch your tone”? What did he or she mean? Deliver the line “I thought you would understand,” and give the line a strong tone (or feeling). Record your line on the Flipgrid and say your tone word after your performance. If you aren’t sure which tone to use, try one of these: Bitter, Loving, Confused, Surprised View some of the performances of your colleagues. Then answer the following questions. How does voice communicate tone? Consider things like volume, pitch, and speed.
    [Show full text]
  • On the Epistemology of Narrative Theory : Narratology and Other Theories of Fictional Narrative Sylvie Patron
    On the Epistemology of Narrative Theory : Narratology and Other Theories of Fictional Narrative Sylvie Patron To cite this version: Sylvie Patron. On the Epistemology of Narrative Theory : Narratology and Other Theories of Fictional Narrative. Robert Kawashima, Gilles Philippe et Thelma Sowley. Phantom Sentences. Essays in Linguistics and Literature presented to Ann Banfield, Berne, Peter Lang, pp. 43-65, 2008. hal- 00698699v2 HAL Id: hal-00698699 https://hal.archives-ouvertes.fr/hal-00698699v2 Submitted on 28 Mar 2013 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. On the Epistemology of Narrative Theory: Narratology and Other Theories of Fictional Narrative Sylvie Patron University of Paris 7-Denis Diderot (Translated by Anne Marsella) Introduction The work of Gérard Genette in the field referred to as “narratology”2 represents one of the most important contributions to narrative theory, considered as a branch of literary theory, in the second half of the twentieth century. I purposely say “one of the most important”, as there are other theoretical contributions, some of which I believe to be equally important though they are not as well known as Genette’s narratology, particularly in France.3 These lesser-known theories are rich in epis- temological reflection.
    [Show full text]
  • Theory and Interpretation of Narrative) Includes Bibliographical References and Index
    Theory and In T e r p r e Tati o n o f n a r r ati v e James Phelan and Peter J. rabinowitz, series editors Postclassical Narratology Approaches and Analyses edited by JaN alber aNd MoNika FluderNik T h e O h i O S T a T e U n i v e r S i T y P r e ss / C O l U m b us Copyright © 2010 by The Ohio State University. All rights reserved Library of Congress Cataloging-in-Publication Data Postclassical narratology : approaches and analyses / edited by Jan Alber and Monika Fludernik. p. cm. — (Theory and interpretation of narrative) Includes bibliographical references and index. ISBN-13: 978-0-8142-5175-1 (pbk. : alk. paper) ISBN-10: 0-8142-5175-7 (pbk. : alk. paper) ISBN-13: 978-0-8142-1142-7 (cloth : alk. paper) ISBN-10: 0-8142-1142-9 (cloth : alk. paper) [etc.] 1. Narration (Rhetoric) I. Alber, Jan, 1973– II. Fludernik, Monika. III. Series: Theory and interpretation of narrative series. PN212.P67 2010 808—dc22 2010009305 This book is available in the following editions: Cloth (ISBN 978-0-8142-1142-7) Paper (ISBN 978-0-8142-5175-1) CD-ROM (ISBN 978-0-8142-9241-9) Cover design by Laurence J. Nozik Type set in Adobe Sabon Printed by Thomson-Shore, Inc. The paper used in this publication meets the minimum requirements of the American National Standard for Information Sciences—Permanence of Paper for Printed Library Materials. ANSI Z39.48-1992. 9 8 7 6 5 4 3 2 1 Contents Acknowledgments vii Introduction Jan alber and monika Fludernik 1 Part i.
    [Show full text]
  • GMAT-Sentence-Correction.Pdf
    A Treasure Trove of Tools and Techniques to Help You Conquer “GMAT Verbal” This eBook presents Chapter 2: Sentence Correction, as excerpted from the parent eBook Chili Hot GMAT: Verbal Review. Maven Publishing mavenpublishing.com © 2011 by Brandon Royal All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical — including photocopying, recording or any information storage and retrieval system — without permission in writing from the author or publisher. Published by: Maven Publishing 4520 Manilla Road, Calgary, Alberta, Canada T2G 4B7 www.mavenpublishing.com Correspondence Address in Asia: GPO Box 440 Central, Hong Kong ISBN 978-1-897393-93-2 eDoc This eDoc contains Chapter 2: Sentence Correction, as excerpted from the parent eDoc Chili Hot GMAT: Verbal Review. Technical Credits: Cover design: George Foster, Fairfield, Iowa, USA Editing: Victory Crayne, Laguna Woods, California, USA GMAT® is a registered trademark of the Graduate Management Admission Council, which neither sponsors nor endorses this product. CONTENTS Topical Checklist 7 Chapter 1: The GMAT Exam 9 What’s on the GMAT Exam? w How is the GMAT Scored? w How does the CAT Work? w Exam Tactics w Attitude and Mental Outlook w Time frame for GMAT Study Chapter 2: Sentence Correction 15 Overview Official Exam Instructions for Sentence Correctionw Strategies and Approaches Review of Sentence Correction Overview w The 100-Question Quiz on Grammar, Diction, and Idioms w Review of Grammatical Terms w Diction Review
    [Show full text]
  • English 10 Overview
    Unit 10.4: Stories of Other Worlds: Science Fiction, Fantasy, and Imaginative Literature The final quarter of the year gives students Enduring Understandings opportunities to let their minds roam free to distant • Science fiction and fantasy combine realistic or imagined worlds, as they explore texts that and fantastic elements to examine and pose questions about our world. intentionally create new settings in order to • Authors work within and against the conventions of established genres to achieve comment on the authors’ own societies. Whether the different purposes. stories question the benefits and dangers of new • Effective storytellers control multiple narrative elements to capture readers’ technology or criticize social practices through imaginations. • Willing suspension of disbelief allows subtle, allegorical narratives, they offer a unique readers of fantastical literature to consider provocative philosophies. perspective on the real world by creating one that mirrors it. Each generation’s anxieties, from the nuclear threat of the postwar era to the more recent rise of electronic networks that simultaneously connect and isolate individuals in a Essential Questions worldwide community, find new expression • Why do some authors choose to through the varied types of stories that writers tell. communicate ideas through imaginative fiction? Students analyze the methods these authors use to • How do writers create worlds that are imaginary yet believable? achieve their purposes, simultaneously questioning • How do writers break new ground within the assumptions that they make. In addition, they established genres? practice creating their own imagined societies in • What makes some stories more engaging than others? order to make a strong statement about the one they live in.
    [Show full text]
  • Analyzing Diction from the AP Vertical Teams Guide for English by the College Board
    Analyzing Diction From The AP Vertical Teams Guide for English by the College Board In all forms of literature—nonfiction, fiction, poetry, and drama—authors choose particular words to convey effect and meaning to the reader. Writers employ diction, or word choice, to communicate ideas and impressions, to evoke emotions, and to convey their views of truth to the reader. The following definitions may be useful in helping you understand and appreciate the deliberate word choices that writers make. LEVELS OF DICTION High or formal: free of slangs, idioms, colloquialisms, and contractions. It often contains polysyllabic words, sophisticated syntax, and elegant word choice. Discerning the impracticable state of the poor culprit’s mind, the elder clergyman, who had carefully prepared himself for the occasion, addressed to the multitude a discourse on sin, in all its branches, but with continual reference to the ignominious letter. So forcibly did he dwell upon this symbol, for the hour or more during which his periods were rolling over the people’s heads, that it assumed new terrors in their imagination, and seemed to derive its scarlet hue from the flames of the infernal pit. from The Scarlet Letter by Hawthorne Neutral: standard language and vocabulary without elaborate words and may include contractions The shark swung over and the old man saw his eye was not alive and then he swung over once again, wrapping himself in two loops of the rope. The old man knew that he was dead but the shark would not accept it. Then, on his back, with his tail lashing and his jaws clicking, the shark plowed over the water as a speedboat does.
    [Show full text]
  • Generating Text with Correct Verb Conjugation: Proposal for a New Automatic Conjugator with Nooj
    Generating Text with Correct Verb Conjugation: Proposal for a New Automatic Conjugator with NooJ Héla Fehri, Sondes Dardour MIRACL Laboratory, University of Sfax [email protected], [email protected] processed. They can be in different forms such as a Abstract website or mobile application. The aim of this paper is to generate a text with This paper describes a system that generates well-conjugated verbs. To reach this objective, we texts with correct verb conjugation. The pro- propose to develop a system that allows parsing a posed system integrates a conjugator devel- text, extracting different infinitive forms of verbs oped using a linguistic approach. This latter is and conjugate them in the appropriate tense. This based on dictionaries and transducers built system integrates a conjugator, which makes it pos- with the NooJ linguistic platform. The conju- sible to conjugate Arabic, French, and English verbs gator treats three languages: Arabic, French and English. It recognizes all verbs and allows in the desired tense. This conjugator should guaran- their conjugation in different tenses. The re- tee the correct conjugation of verbs without errors. sults obtained are satisfactory and can easily be In this paper, after an introduction to the pro- improved upon by processing other forms, posed method, we describe our resource construc- such as the negative. tion and implementation using the NooJ linguistic platform (Silberztein and Tutin, 2005). Then, we 1 Introduction give an idea of the experimentation and the results obtained and conclude with some future perspec- Automatic Language Processing is an area of multi- tives. disciplinary research that permits the collaboration of linguists, computer scientists, logicians, psy- 2 Proposed Method chologists, documentalists, lexicographers, and translators.
    [Show full text]