Translation Quality Assessment: A Brief Survey on Manual and Automatic Methods

Lifeng Han1, Gareth J. F. Jones1, and Alan F. Smeaton2
1 ADAPT Research Centre
2 Insight Centre for Data Analytics
School of Computing, Dublin City University, Dublin, Ireland
Abstract

To facilitate effective translation modeling and translation studies, one of the crucial questions to address is how to assess translation quality. From the perspectives of accuracy, reliability, repeatability and cost, translation quality assessment (TQA) is itself a rich and challenging task. In this work, we present a high-level and concise survey of TQA methods, including both manual judgement criteria and automated evaluation metrics, which we classify into further detailed sub-categories. We hope that this work will be an asset for both translation model researchers and quality assessment researchers.

tical methods (Brown et al., 1993; Och and Ney, 2003; Chiang, 2005; Koehn, 2010), to the current paradigm of neural network structures (Cho et al., 2014; Johnson et al., 2016; Vaswani et al., 2017; Lample and Conneau, 2019), MT quality continues to improve. However, as MT and translation quality assessment (TQA) researchers report, MT outputs are still far from reaching human parity (Läubli et al., 2018; Läubli et al., 2020; Han et al., 2020a). MT quality assessment is thus still an important task, both to facilitate MT research itself and for downstream applications. TQA remains a challenging task because of the richness, variety, and ambiguity of natural language itself, e.g. the same concept can be expressed in different word structures and pat-