Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pages 1–5, Avignon, France, April 23-27, 2012
EACL 2012

Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

April 23-27, 2012
Avignon, France

© 2012 The Association for Computational Linguistics

ISBN 978-1-937284-19-0

Order copies of this and other ACL proceedings from:

Association for Computational Linguistics (ACL)
209 N. Eighth Street
Stroudsburg, PA 18360
USA
Tel: +1-570-476-8006
Fax: +1-570-476-0860
[email protected]

Organizers:

Frédérique Segond, XEROX

Table of Contents

Language Resources Factory: case study on the acquisition of Translation Memories
    Marc Poch, Antonio Toral and Núria Bel

Harnessing NLP Techniques in the Processes of Multilingual Content Management
    Anelia Belogay, Diman Karagyozov, Svetla Koeva, Cristina Vertan, Adam Przepiórkowski, Dan Cristea and Plovios Raxis

Collaborative Machine Translation Service for Scientific texts
    Patrik Lambert, Jean Senellart, Laurent Romary, Holger Schwenk, Florian Zipser, Patrice Lopez and Frédéric Blain

TransAhead: A Writing Assistant for CAT and CALL
    Chung-chi Huang, Ping-che Yang, Mei-hua Chen, Hung-ting Hsieh, Ting-hui Kao and Jason S. Chang

SWAN - Scientific Writing AssistaNt. A Tool for Helping Scholars to Write Reader-Friendly Manuscripts
    Tomi Kinnunen, Henri Leisma, Monika Machunik, Tuomo Kakkonen and Jean-Luc LeBrun

ONTS: "Optima" News Translation System
    Marco Turchi, Martin Atkinson, Alastair Wilcox, Brett Crawley, Stefano Bucci, Ralf Steinberger and Erik Van der Goot

Just Title It! (by an Online Application)
    Cédric Lopez, Violaine Prince and Mathieu Roche

Folheador: browsing through Portuguese semantic relations
    Hugo Gonçalo Oliveira, Hernani Costa and Diana Santos

A Computer Assisted Speech Transcription System
    Alejandro Revuelta-Martínez, Luis Rodríguez and Ismael García-Varea

A Statistical Spoken Dialogue System using Complex User Goals and Value Directed Compression
    Paul A. Crook, Zhuoran Wang, Xingkun Liu and Oliver Lemon

Automatically Generated Customizable Online Dictionaries
    Enikő Héja and Dávid Takács

MaltOptimizer: An Optimization Tool for MaltParser
    Miguel Ballesteros and Joakim Nivre

Fluid Construction Grammar: The New Kid on the Block
    Remi van Trijp, Luc Steels, Katrien Beuls and Pieter Wellens

A Support Platform for Event Detection using Social Intelligence
    Timothy Baldwin, Paul Cook, Bo Han, Aaron Harwood, Shanika Karunasekera and Masud Moshtaghi

NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Extraction Tools
    Giuseppe Rizzo and Raphael Troncy

Automatic Analysis of Patient History Episodes in Bulgarian Hospital Discharge Letters
    Svetla Boytcheva, Galia Angelova and Ivelina Nikolova

ElectionWatch: Detecting Patterns in News Coverage of US Elections
    Saatviga Sudhahar, Thomas Lansdall-Welfare, Ilias Flaounas and Nello Cristianini

Query log analysis with LangLog
    Marco Trevisan, Eduard Barbu, Igor Barsanti, Luca Dini, Nikolaos Lagos, Frédérique Segond, Mathieu Rhulmann and Ed Vald

A platform for collaborative semantic annotation
    Valerio Basile, Johan Bos, Kilian Evang and Noortje Venhuizen

HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce
    Andrea Gesmundo and Nadi Tomeh

brat: a Web-based Tool for NLP-Assisted Text Annotation
    Pontus Stenetorp, Sampo Pyysalo, Goran Topić, Tomoko Ohta, Sophia Ananiadou and Jun'ichi Tsujii

Conference Program

Wednesday, April 25, 2012

Session 1: (16:10 - 16:40)

Language Resources Factory: case study on the acquisition of Translation Memories
    Marc Poch, Antonio Toral and Núria Bel

Harnessing NLP Techniques in the Processes of Multilingual Content Management
    Anelia Belogay, Diman Karagyozov, Svetla Koeva, Cristina Vertan, Adam Przepiórkowski, Dan Cristea and Plovios Raxis

Collaborative Machine Translation Service for Scientific texts
    Patrik Lambert, Jean Senellart, Laurent Romary, Holger Schwenk, Florian Zipser, Patrice Lopez and Frédéric Blain

Session 2: (16:50 - 17:20)

TransAhead: A Writing Assistant for CAT and CALL
    Chung-chi Huang, Ping-che Yang, Mei-hua Chen, Hung-ting Hsieh, Ting-hui Kao and Jason S. Chang

SWAN - Scientific Writing AssistaNt. A Tool for Helping Scholars to Write Reader-Friendly Manuscripts
    Tomi Kinnunen, Henri Leisma, Monika Machunik, Tuomo Kakkonen and Jean-Luc LeBrun

ONTS: "Optima" News Translation System
    Marco Turchi, Martin Atkinson, Alastair Wilcox, Brett Crawley, Stefano Bucci, Ralf Steinberger and Erik Van der Goot

Just Title It! (by an Online Application)
    Cédric Lopez, Violaine Prince and Mathieu Roche

Session 3: (17:30 - 18:00)

Folheador: browsing through Portuguese semantic relations
    Hugo Gonçalo Oliveira, Hernani Costa and Diana Santos

A Computer Assisted Speech Transcription System
    Alejandro Revuelta-Martínez, Luis Rodríguez and Ismael García-Varea

A Statistical Spoken Dialogue System using Complex User Goals and Value Directed Compression
    Paul A. Crook, Zhuoran Wang, Xingkun Liu and Oliver Lemon

Automatically Generated Customizable Online Dictionaries
    Enikő Héja and Dávid Takács

Thursday, April 26, 2012

Session 4: (16:10 - 16:40)

MaltOptimizer: An Optimization Tool for MaltParser
    Miguel Ballesteros and Joakim Nivre

Fluid Construction Grammar: The New Kid on the Block
    Remi van Trijp, Luc Steels, Katrien Beuls and Pieter Wellens

A Support Platform for Event Detection using Social Intelligence
    Timothy Baldwin, Paul Cook, Bo Han, Aaron Harwood, Shanika Karunasekera and Masud Moshtaghi

NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Extraction Tools
    Giuseppe Rizzo and Raphael Troncy

Session 5: (16:50 - 17:20)

Automatic Analysis of Patient History Episodes in Bulgarian Hospital Discharge Letters
    Svetla Boytcheva, Galia Angelova and Ivelina Nikolova

ElectionWatch: Detecting Patterns in News Coverage of US Elections
    Saatviga Sudhahar, Thomas Lansdall-Welfare, Ilias Flaounas and Nello Cristianini

Query log analysis with LangLog
    Marco Trevisan, Eduard Barbu, Igor Barsanti, Luca Dini, Nikolaos Lagos, Frédérique Segond, Mathieu Rhulmann and Ed Vald

Session 6: (17:30 - 18:00)

A platform for collaborative semantic annotation
    Valerio Basile, Johan Bos, Kilian Evang and Noortje Venhuizen

HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce
    Andrea Gesmundo and Nadi Tomeh

brat: a Web-based Tool for NLP-Assisted Text Annotation
    Pontus Stenetorp, Sampo Pyysalo, Goran Topić, Tomoko Ohta, Sophia Ananiadou and Jun'ichi Tsujii

Language Resources Factory: case study on the acquisition of Translation Memories∗

Marc Poch, UPF Barcelona, Spain, [email protected]
Antonio Toral, DCU Dublin, Ireland, [email protected]
Núria Bel, UPF Barcelona, Spain, [email protected]

Abstract

This paper demonstrates a novel distributed architecture to facilitate the acquisition of Language Resources. We build a factory that automates the stages involved in the acquisition, production, updating and maintenance of these resources. The factory is designed as a platform where functionalities are deployed as web services, which can be combined in complex acquisition chains using workflows. We show a case study, which acquires a Translation Memory for a given pair of languages and a domain using web services for crawling, sentence alignment and conversion to TMX.

1 Introduction

A fundamental issue for many tasks in the field of Computational Linguistics and Language Technologies in general is the lack of Language Resources (LRs) to tackle them successfully, especially for some languages and domains. This is the so-called LR bottleneck.

Our objective is to build a factory of LRs that automates the stages involved in the acquisition, production, updating and maintenance of LRs required by Machine Translation (MT), and by

2 Web Services and Workflows

The factory is designed as a platform of web services (WSs) where the users can create and use these services directly or combine them in more complex chains. These chains are called workflows and can represent different combinations of tasks, e.g. “extract the text from a PDF document and obtain the Part of Speech (PoS) tagging” or “crawl this bilingual website and align its sentence pairs”. Each task is carried out using NLP tools deployed as WSs in the factory.

Web Service Providers (WSPs) are institutions (universities, companies, etc.) that are willing to offer services for some tasks. WSs are services made available from a web server to remote users or to other connected programs. WSs are built upon protocols, servers and programming languages. Their massive adoption has contributed to making this technology rather interoperable and open. In fact, WSs allow computer programs distributed in different locations to interact with each other.

WSs introduce a completely new paradigm in the way we use software tools. Before, every researcher or laboratory had to install and maintain all the different tools that they needed for their work, which has a considerable cost in both