Australasian Language Technology Association Workshop 2012
Proceedings of the Workshop

Editors:
Paul Cook
Scott Nowson

4-6 December 2012
Otago University
Dunedin, New Zealand

Australasian Language Technology Association Workshop 2012 (ALTA 2012)
http://www.alta.asn.au/events/alta2012

Volume 10, 2012
ISSN: 1834-7037

ALTA 2012 Workshop Committees

Workshop Co-Chairs
• Paul Cook (The University of Melbourne)
• Scott Nowson (Appen Butler Hill)

Workshop Local Organiser
• Alistair Knott (University of Otago)

Programme Committee
• Timothy Baldwin (University of Melbourne)
• Lawrence Cavedon (NICTA and RMIT University)
• Nathalie Colineau (CSIRO - ICT Centre)
• Rebecca Dridan (University of Oslo)
• Alex Chengyu Fang (The City University of Hong Kong)
• Nitin Indurkhya (UNSW)
• Jong-Bok Kim (Kyung Hee University)
• Alistair Knott (University of Otago)
• Oi Yee Kwong (City University of Hong Kong)
• Francois Lareau (Macquarie University)
• Jey Han Lau (University of Melbourne)
• Fang Li (Shanghai Jiao Tong University)
• Haizhou Li (Institute for Infocomm Research)
• Marco Lui (University of Melbourne)
• Ruli Manurung (Universitas Indonesia)
• David Martinez (NICTA VRL)
• Tara McIntosh (Wavii)
• Meladel Mistica (The Australian National University)
• Diego Mollá (Macquarie University)
• Su Nam Kim (Monash University)
• Luiz Augusto Pizzato (University of Sydney)
• David Powers (Flinders University)
• Stijn De Saeger (National Institute of Information and Communications Technology)
• Andrea Schalley (Griffith University)
• Rolf Schwitter (Macquarie University)
• Tony Smith (Waikato University)
• Virach Sornlertlamvanich (National Electronics and Computer Technology Center)
• Hanna Suominen (NICTA)
• Karin Verspoor (National ICT Australia)

Preface

The precious volume you are currently reading contains the papers accepted for presentation at the Australasian Language Technology Association Workshop (ALTA) 2012, held at the University of Otago in Dunedin, New Zealand on 4-6 December 2012. We are excited that this tenth anniversary edition of the ALTA Workshop sees ALTA leaving Australia for the first time, and becoming a truly Australasian workshop. Sadly we say goodbye to the Aussie bush hat on the conference webpage, but it is in the spirit of Mick “Crocodile” Dundee that we cross the Tasman.

The goals of the workshop are to:
• bring together the growing Language Technology (LT) community in the Australasian region and encourage interactions;
• encourage interactions and collaboration within this community and with the wider international LT community;
• foster interaction between academic and industrial researchers, to encourage dissemination of research results;
• provide a forum for students and young researchers to present their research;
• facilitate the discussion of new and ongoing research and projects;
• provide an opportunity for the broader artificial intelligence community to become aware of local LT research; and, finally,
• increase visibility of LT research in Australasia and overseas.

This year's ALTA Workshop presents 14 peer-reviewed papers, including eleven full and three short papers. We received a total of 18 submissions. Each paper, full and short, was reviewed by at least three members of the program committee. With the more-international flavour of the workshop, this year's program committee consisted of more members from outside of Australia and New Zealand than in past years. The reviewing for the workshop was double blind, and done in accordance with the DIISRTE requirements for E1 conference publications. Furthermore, great care was taken to avoid all conflicts of interest; in particular, no paper was assessed by a reviewer from the same institution as any of the authors. In the case of submissions by a programme co-chair, the double-blind review process was upheld, and acceptance decisions were made by the non-author co-chair.

In addition to peer-reviewed papers, the proceedings include the abstracts of the invited talks by Jen Hay (University of Canterbury) and Chris Brockett (Microsoft Research), both of whom we are honoured to welcome to ALTA. Also within, you will find an overview of the ALTA Shared Task and three system descriptions by shared task participants. These contributions were not peer-reviewed.

We would like to thank, in no particular order: all of the authors who submitted papers to ALTA; the fellowship of the program committee for the time and effort they put into maintaining the high standards of our reviewing process; our Man In Dunedin, the local organiser Alistair Knott, for taking care of all the physical logistics and lining up some great social events; our invited speakers Jen Hay and Chris Brockett for agreeing to share their wisdom with us; the team from NICTA and James Curran for agreeing to host two fascinating tutorials; and Diego Mollá and David Martinez, the program co-chairs of ALTA 2011, for their valuable help and support. We would like to acknowledge the constant support and advice of the out-going ALTA Executive Committee and in particular President Timothy Baldwin.

Finally, we gratefully recognise our sponsors: NICTA, Microsoft Research, Appen Butler Hill, and the University of Otago. Their generous support enabled us to offer travel subsidies to six students to attend and present at ALTA.

Paul Cook and Scott Nowson
Programme Co-Chairs

ALTA 2012 Programme

The proceedings are available online at http://www.alta.asn.au/events/alta2012/proceedings/

Tuesday 4 December 2012

Pre-workshop tutorials
Biomedical Natural Language Processing (Owheo 106)
A Crash Course in Statistical Natural Language Processing (Lab F)

Wednesday 5 December 2012

08:50-09:00  Opening remarks (Owheo 106)
09:00-10:00  Invited talk (Owheo 106; Chair: Paul Cook)
             Jennifer Hay
             Using a large annotated historical corpus to study word-specific effects in sound change
10:00-10:30  Coffee

Session 1 (Owheo 106; Chair: Ingrid Zukerman)
10:30-11:00  Angrosh M.A., Stephen Cranefield and Nigel Stanger
             A Citation Centric Annotation Scheme for Scientific Articles
11:00-11:30  Michael Symonds, Guido Zuccon, Bevan Koopman, Peter Bruza and Anthony Nguyen
             Semantic Judgement of Medical Concepts: Combining Syntagmatic and Paradigmatic Information with the Tensor Encoding Model
11:30-12:00  Teresa Lynn, Jennifer Foster, Mark Dras and Elaine Uí Dhonnchadha
             Active Learning and the Irish Treebank
12:00-13:30  Lunch

Session 2 (Owheo 106; Chair: Diego Mollá)
13:30-14:00  Marco Lui, Timothy Baldwin and Diana McCarthy
             Unsupervised Estimation of Word Usage Similarity
14:00-14:30  Mary Gardiner and Mark Dras
             Valence Shifting: Is It A Valid Task?
14:30-15:00  ALTA 2012 best paper
             Minh Duc Cao and Ingrid Zukerman
             Experimental Evaluation of a Lexicon- and Corpus-based Ensemble for Multi-way Sentiment Analysis
15:00-15:30  Coffee

Session 3 (Owheo 106; Chair: Chris Brockett)
15:30-16:00  James Breen, Timothy Baldwin and Francis Bond
             Extraction and Translation of Japanese Multi-word Loanwords
16:00-16:30  Yvette Graham, Timothy Baldwin, Aaron Harwood, Alistair Moffat and Justin Zobel
             Measurement of Progress in Machine Translation
16:30-17:30  ALTA business meeting (Owheo 106)
19:30-       Conference dinner (Filadelfio's, 3 North Road)

Thursday 6 December 2012

09:00-10:00  Invited talk (Owheo 206; Chair: Timothy Baldwin)
             Chris Brockett
             Diverse Words, Shared Meanings: Statistical Machine Translation for Paraphrase, Grounding, and Intent
10:00-10:30  Coffee

Session 4: ALTA/ADCS shared session (Owheo 106; Chair: Alistair Knott)
10:30-11:00  ADCS paper
             Lida Ghahremanloo, James Thom and Liam Magee
             An Ontology Derived from Heterogeneous Sustainability Indicator Set Documents
11:00-11:30  ADCS paper
             Bevan Koopman, Peter Bruza, Guido Zuccon, Michael John Lawley and Laurianne Sitbon
             Graph-based Concept Weighting for Medical Information Retrieval
11:30-12:00  Abeed Sarker, Diego Mollá-Aliod and Cecile Paris
             Towards Two-step Multi-document Summarisation for Evidence Based Medicine: A Quantitative Analysis
12:00-12:30  Alex G. Smith, Christopher X. S. Zee and Alexandra L. Uitdenbogerd
             In Your Eyes: Identifying Clichés in Song Lyrics
12:30-14:00  Lunch

Session 5: ALTA Shared Task and poster boasters (Owheo 206; Chair: Karin Verspoor)
14:00-14:30  Iman Amini, David Martinez and Diego Mollá
             ALTA 2012 Shared Task overview
14:30-14:50  ALTA poster boasters
             Paul Cook and Marco Lui
             langid.py for better language modelling
             Robert Fromont and Jennifer Hay
             LaBB-CAT: an Annotation Store
             Jenny Mcdonald, Alistair Knott and Richard Zeng
             Free-text input vs menu selection: exploring the difference with a tutorial dialogue system.
             Jared Willett, Timothy Baldwin, David Martinez and Angus Webb
             Classification of Study Region in Environmental Science Abstracts
             ALTA Shared Task poster boasters
             Marco Lui
             Feature Stacking for Sentence Classification in Evidence-based Medicine
             Abeed Sarker
             Multi-class classification of medical sentences using SVMs
14:50-15:00  Awards and final remarks (Owheo 206)
15:00-15:30  Coffee
15:30-17:00  Poster session with ADCS (Owheo 106)
19:30-21:30  Boat trip: Meet at 19:00 at the wharf, 20 Fryatt St.

Contents

Invited talks
Using a large annotated historical corpus to study word-specific effects in sound change
    Jennifer Hay
Diverse Words, Shared Meanings: Statistical Machine Translation for Paraphrase, Grounding, and Intent
    Chris Brockett

Full papers
A Citation Centric Annotation Scheme for Scientific Articles
    Angrosh M.A., Stephen Cranefield and Nigel Stanger
Semantic Judgement of Medical Concepts: Combining Syntagmatic and Paradigmatic Information with the Tensor Encoding