The Workshop Programme

The Workshop Programme Monday, May 26 14:30 -15:00 Opening Nancy Ide and Adam Meyers 15:00 -15:30 SIGANN Shared Corpus Working Group Report Adam Meyers 15:30 -16:00 Discussion: SIGANN Shared Corpus Task 16:00 -16:30 Coffee break 16:30 -17:00 Towards Best Practices for Linguistic Annotation Nancy Ide, Sameer Pradhan, Keith Suderman 17:00 -18:00 Discussion: Annotation Best Practices 18:00 -19:00 Open Discussion and SIGANN Planning Tuesday, May 27 09:00 -09:10 Opening Nancy Ide and Adam Meyers 09:10 -09:50 From structure to interpretation: A double-layered annotation for event factuality Roser Saurí and James Pustejovsky 09:50 -10:30 An Extensible Compositional Semantics for Temporal Annotation Harry Bunt, Chwhynny Overbeeke 10:30 -11:00 Coffee break 11:00 -11:40 Using Treebank, Dictionaries and GLARF to Improve NomBank Annotation Adam Meyers 11:40 -12:20 A Dictionary-based Model for Morpho-Syntactic Annotation Cvetana Krstev, Svetla Koeva, Du!ko Vitas 12:20 -12:40 Multiple Purpose Annotation using SLAT - Segment and Link-based Annotation Tool (DEMO) Masaki Noguchi, Kenta Miyoshi, Takenobu Tokunaga, Ryu Iida, Mamoru Komachi, Kentaro Inui 12:40 -14:30 Lunch 14:30 -15:10 Using inheritance and coreness sets to improve a verb lexicon harvested from FrameNet Mark McConville and Myroslava O. Dzikovska 15:10 -15:50 An Entailment-based Approach to Semantic Role Annotation Voula Gotsoulia 16:00 -16:30 Coffee break 16:30 -16:50 A French Corpus Annotated for Multiword Expressions with Adverbial Function Eric Laporte, Takuya Nakamura, Stavroula Voyatzi 16:50 -17:20 On Construction of Polish Spoken Dialogs Corpus Agnieszka Mykowiecka, Krzysztof Marasek, Ma"gorzata Marciniak, Joanna Rabiega-Wisniewska, Ryszard Gubrynowicz 17:20 -17:40 A RESTful interface to Annotations on the Web Steve Cassidy 17:40 -18:30 Panel : Next Challenges in Annotation Theory and Practice 18:30 -19:00 Open Discussion ! "! Workshop Organisers Nancy Ide, Vassar College (co-chair) Adam Meyers, New York University (co-chair) Inderjeet Mani, MITRE Corporation Antonio Pareja-Lora, SIC, UCM / OEG, UPM Sameer Pradhan, BBN Technologies Manfred Stede, Universitat Potsdam Nianwen Xue, University of Colorado ! ""! Workshop Programme Committee David Ahn, Powerset Lars Ahrenberg, Linkoping University Timothy Baldwin, University of Melbourne Francis Bond, NICT Kalina Bontcheva, University of Sheffield Matthias Buch-Kromann, Copenhagen Business School Paul Buitelaar, DFKI Jean Carletta, University of Edinburgh Christopher Cieri, Linguistic Data Consortium/University of Pennsylvania Hamish Cunningham, University of Sheffield David Day, MITRE Corporation Thierry Declerck, DFKI Ludovic Denoyer, University of Paris 6 Richard Eckart, Darmstadt University of Technology Tomaz Erjavec, Jozef Stefan Institute David Farwell, New Mexico State University Alex Chengyu Fang, City University Hong Kong Chuck Fillmore, International Computer Science Institute John Fry, San Jose State University Claire Grover, University of Edinburgh Eduard Hovy, Information Sciences Institute Baden Hughes, University of Melbourne Emi Izumi, NICT Aravind Joshi, University of Pennsylvania Ewan Klein, University of Edinburgh Mike Maxwell, University of Maryland Stephan Oepen, University of Oslo Martha Palmer, University of Colorado Manfred Pinkal, Saarland University James Pustejovsky, Brandeis University Owen Rambow, Columbia University Laurent Romary, Max-Planck Digital Library Erik Tjong Kim Sang, University of Amsterdam Graham Wilcock, University of Helsinki Theresa Wilson, University of Edinburgh ! """! Table of Contents Long papers From structure to interpretation: A double-layered annotation for event factuality Roser Saurí, James Pustejovsky..............................................................................................1 An Extensible Compositional Semantics for Temporal Annotation Harry Bunt, Chwhynny Overbeeke .........................................................................................9 Using Treebank, Dictionaries and GLARF to Improve NomBank Annotation Adam Meyers.........................................................................................................................17 A Dictionary-based Model for Morpho-Syntactic Annotation Cvetana Krstev, Svetla Koeva, Du!ko Vitas .........................................................................25 Using inheritance and coreness sets to improve a verb lexicon harvested from FrameNet Mark McConville, Myroslava O. Dzikovska.........................................................................33 An Entailment-based Approach to Semantic Role Annotation Voula Gotsoulia ....................................................................................................................41 Short papers A French Corpus Annotated for Multiword Expressions with Adverbial Function Eric Laporte, Takuya Nakamura, Stavroula Voyatzi............................................................48 On Construction of Polish Spoken Dialogs Corpus Agnieszka Mykowiecka, Krzysztof Marasek, Ma"gorzata Marciniak, Joanna Rabiega-Wisniewska, Ryszard Gubrynowicz........................................................................52 A RESTful interface to Annotations on the Web Steve Cassidy.........................................................................................................................56 Demonstration Multiple Purpose Annotation using SLAT - Segment and Link-based Annotation Tool Masaki Noguchi, Kenta Miyoshi, Takenobu Tokunaga, Ryu Iida, Mamoru Komachi, Kentaro Inui..........................................................................................................61 ! "#! Author Index Bunt, Harry 9 Cassidy, Steve 56 Dzikovska, Myroslava O. 33 Gotsoulia, Voula 41 Gubrynowicz, Ryszard 52 Iida, Ryu 61 Inui, Kentaro 61 Koeva, Svetla 25 Komachi, Mamoru 61 Krstev, Cvetana 25 Laporte, Eric 48 Marasek, Krzysztof 52 Marciniak, Ma!gorzata 52 McConville, Mark 33 Meyers, Adam 17 Miyoshi, Kenta 61 Mykowiecka, Agnieszka 52 Nakamura, Takuya 48 Noguchi, Masaki 61 Overbeeke, Chwhynny 9 Pustejovsky, James 1 Rabiega-Wisniewska, Joanna 52 Saurí, Roser 1 Tokunaga, Takenobu 61 Vitas, Du"ko 25 Voyatzi, Stavroula 48 ! ! #! From structure to interpretation: A double-layered annotation for event factuality Roser Saur´ı and James Pustejovsky Lab for Linguistics and Computation Computer Science Department Brandeis University roser,jamesp @cs.brandeis.edu { } Abstract Current work from different areas in the field points out the need for systems to be sensitive to the factuality nature of events mentioned in text; that is, to recognize whether events are presented as corresponding to real situations in the world, situations that have not happened, or situations of uncertain status. Event factuality is a necessary component for representing events in discourse, but for annotation purposes it poses a representational challenge because it is expressed through the interaction of a varied set of structural markers. Part of these factuality markers is already encoded in some of the existing corpora but always in a partial way; that is, missing an underlying model that is capable of representing the factuality value resulting from their interaction. In this paper, we present FactBank, a corpus of events annotated with factuality information which has been built on top of TimeBank. Together, TimeBank and FactBank offer a double-layered annotation of event factuality: where TimeBank encodes most of the basic structural elements expressing factuality information, FactBank adds a representation of the resulting factuality interpretation. 1. Introduction entailment, it has been taken as a basic feature in some of In the past decade, most efforts towards corpus construc- the systems participating in (or using the data from) previ- tion have been devoted to encoding a variety of seman- ous PASCAL RTE challenges. tic information structures. For example, much work has For example, Tatu & Moldovan (2005) treat intensional gone to annotating the basic units that configure proposi- contexts, de Marneffe et al. (2006) look at features ac- tions (PropBank, FrameNet) and the relations these hold counting for the presence of polarity, modality, and factivity at the discourse level (RST Corpus, Penn Discourse Tree- markers in the textual fragments, while Snow & Vander- Bank, GraphBank), as well as specific knowledge that has wende (2006) check for polarity and modality scoping over proved fundamental in tasks requiring some degree of text matching nodes in a graph. Most significantly, the system understanding, such as temporal information (TimeBank) that obtained the best absolute result in the three RTE chal- and opinion expressions (MPQA Opinion Corpus).1 lenges, scoring an 80% accuracy (Hickl & Bensley, 2007), The field is moving now towards finding platforms for uni- is based on identifying the set of publicly-expressed beliefs fying them in an optimal way –e.g., Pradhan et al. (2007); of the author; that is, on the author’s commitments of how Verhagen et al. (2007). It therefore seems we are at a things are in the world according to what is expressed in point where the first elements for text understanding can text –either asserted, presupposed, or implicated. be brought together. Event factuality is a necessary component for representing Nonetheless, current work from different areas in the field events in discourse, together with other levels of informa- points out the need for systems to be sensitive to an addi- tion such as argument structure or temporal information.

The Workshop Programme

Talk Bank: a Multimodal Database of Communicative Interaction

Adapting to Trends in Language Resource Development: a Progress

The Main Features of the E-Glava Online Valency Dictionary

Characteristics of Text-To-Speech and Other Corpora

100000 Podcasts

Syntactic Annotation of Spoken Utterances: a Case Study on the Czech Academic Corpus

Multilingvální Systémy Rozpoznávání Řeči a Jejich Efektivní Učení

Named Entity Recognition in Question Answering of Speech Data

Reducing Out-Of-Vocabulary in Morphology to Improve the Accuracy in Arabic Dialects Speech Recognition

Here ACL Makes a Return to Asia!

Detection of Imperative and Declarative Question-Answer Pairs in Email Conversations

Multimedia Corpora (Media Encoding and Annotation) (Thomas Schmidt, Kjell Elenius, Paul Trilsbeek)