Dialogue & Discourse 7(2) (2016) 1-28
doi: 10.5087/dad.2016.201

A step-wise approach to discourse annotation: Towards a reliable categorization of coherence relations

Merel C.J. Scholman
[email protected]
Computational Linguistics and Phonetics, Saarland University
Campus C7.4, 66123, Saarbrücken, Germany

Jacqueline Evers-Vermeul
[email protected]
Utrecht Institute of Linguistics OTS, Utrecht University
Trans 10, 3512 JK, Utrecht, The Netherlands

Ted J.M. Sanders
[email protected]
Utrecht Institute of Linguistics OTS, Utrecht University
Trans 10, 3512 JK, Utrecht, The Netherlands

Editor: Barbara Di Eugenio

Submitted 11/2014; Accepted 12/2015; Published online 02/2016

Abstract

Over the last decade, the annotation of coherence relations has attracted increasing interest from the linguistics research community. Often, trained linguists are employed for discourse annotation tasks. In this article, we investigate whether non-trained, non-expert annotators are capable of annotating coherence relations. To this end, we introduce substitution and paraphrase tests that guide annotators during the process, and we propose a systematic, step-wise annotation scheme. This annotation scheme is based on the cognitive approach to coherence relations (Sanders et al., 1992, 1993), which consists of a taxonomy of coherence relations in terms of four cognitive primitives. The reliability of this annotation scheme is tested in an annotation experiment with 40 non-trained, non-expert annotators. The results show that two of the four primitives, polarity and order of the segments, can be applied reliably by non-trained annotators. The other two primitives, basic operation and source of coherence, are more problematic.