Improving machine translation quality prediction with syntactic tree kernels
2011 (English). In: Proceedings of the 15th conference of the European Association for Machine Translation (EAMT 2011) / [ed] Mikel L. Forcada, Heidi Depraetere, Vincent Vandeghinste, European Association for Machine Translation (EAMT), 2011, 233-240 p. Conference paper (Refereed)
We investigate the problem of predicting the quality of a given Machine Translation (MT) output segment as a binary classification task. In a study with four different data sets in two text genres and two language pairs, we show that the performance of a Support Vector Machine (SVM) classifier can be improved by extending the feature set with implicitly defined syntactic features in the form of tree kernels over syntactic parse trees. Moreover, we demonstrate that syntax tree kernels achieve surprisingly high performance levels even without additional features, which makes them suitable as a low-effort initial building block for an MT quality estimation system.
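To illustrate what "implicitly defined syntactic features" means, the following is a minimal sketch of one widely used tree kernel, the subset tree kernel of Collins and Duffy. The abstract does not say which tree kernel or toolkit the authors used, so this is an illustrative stand-in and not the paper's implementation. The nested-tuple tree encoding, the function names and the `decay` parameter are all assumptions made for this example.

```python
from functools import lru_cache
from math import sqrt

# Assumed encoding: an internal node is a tuple (label, child, child, ...);
# a leaf (word) is a plain string.

def production(node):
    """Return the grammar production at a node: its label plus child signatures."""
    # Internal children are wrapped in a 1-tuple so they never collide with words.
    return (node[0], tuple(c if isinstance(c, str) else (c[0],) for c in node[1:]))

def internal_nodes(tree):
    """List every internal (non-leaf) node of the tree, root first."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes

def subset_tree_kernel(t1, t2, decay=1.0):
    """Weighted count of tree fragments shared by t1 and t2 (Collins-Duffy recursion)."""

    @lru_cache(maxsize=None)
    def common(n1, n2):
        # Fragments rooted at n1 and n2 can only match if the productions agree.
        if production(n1) != production(n2):
            return 0.0
        score = decay
        for c1, c2 in zip(n1[1:], n2[1:]):
            if not isinstance(c1, str):
                # Each internal child is either cut off (1) or expanded further.
                score *= 1.0 + common(c1, c2)
        return score

    return sum(common(a, b) for a in internal_nodes(t1) for b in internal_nodes(t2))

def normalized_kernel(t1, t2, decay=1.0):
    """Scale the kernel to [0, 1] so that large trees do not dominate."""
    k12 = subset_tree_kernel(t1, t2, decay)
    k11 = subset_tree_kernel(t1, t1, decay)
    k22 = subset_tree_kernel(t2, t2, decay)
    return k12 / sqrt(k11 * k22)

if __name__ == "__main__":
    # Hypothetical toy parses, not data from the paper.
    a = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "barks")))
    b = ("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "barks")))
    print(subset_tree_kernel(a, b), normalized_kernel(a, b))
```

The kernel never enumerates tree fragments explicitly, which is why the features count as implicit. A matrix of pairwise kernel values over parsed segments could then be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.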
Place, publisher, year, edition, pages
European Association for Machine Translation (EAMT), 2011. 233-240 p.
Keywords
MT quality prediction, Tree kernels
Language Technology (Computational Linguistics)
Research subject Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-162883; OAI: oai:DiVA.org:uu-162883; DiVA: diva2:462153
EAMT 2011, Leuven, Belgium, May 30, 2011 - May 31, 2011