ClassifierGuesser: A Context-based Classifier Prediction System for Chinese Language Learners

Nicole Peinelt [1,2], Maria Liakata [1,2] and Shu-Kai Hsieh [3]
[1] The Alan Turing Institute, London, UK
[2] Department of Computer Science, University of Warwick, Coventry, UK
[3] Graduate Institute of Linguistics, National Taiwan University, Taipei, Taiwan
{n.peinelt, m.liakata},
[email protected]

Abstract

Classifiers are function words that are used to express quantities in Chinese and are especially difficult for language learners. In contrast to previous studies, we argue that the choice of classifiers is highly contextual and train context-aware machine learning models based on a novel publicly available dataset, outperforming previous baselines. We further present use cases for our database and models in an interactive demo system.

1 Introduction

Languages such as Chinese are characterized by the existence of a class of words commonly referred to as ‘classifiers’ or ‘measure words’.

2011), as well as an SVM with syntactic and ontological features (Guo and Zhong, 2005). However, without any context, classifier assignment can be ambiguous. For instance, the noun 球 ‘ball’ can be modified by ke, a classifier for round objects, when referring to the object itself as in (1), but requires the event classifier chang in the context of a ball match as in (2). We argue that context is an important factor for classifier selection, since a head word may have multiple associated classifiers, but the final classifier selection is restricted by the context.

(1) 一 颗 红色 的 球
    one ke red DE ball
    ‘a red ball’

(2) 一 场 精彩 的 球