Bayesian Word Alignment for Massively Parallel Texts
2014 (English). In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, Association for Computational Linguistics, 2014, 123-127 p. Conference paper (Refereed)
There has been a great amount of work done in the field of bitext alignment, but the problem of aligning words in massively parallel texts with hundreds or thousands of languages is largely unexplored. While the basic task is similar, there are also important differences in purpose, method and evaluation between the problems. In this work, I present a non-parametric Bayesian model that can be used for simultaneous word alignment in massively parallel corpora. This method is evaluated on a corpus containing 1144 translations of the New Testament.
Place, publisher, year, edition, pages
Association for Computational Linguistics, 2014. 123-127 p.
word alignment, Bayesian models, nonparametric models, Gibbs sampling, parallel corpora, massively parallel corpora
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:su:diva-103011
OAI: oai:DiVA.org:su-103011
DiVA: diva2:714290
14th Conference of the European Chapter of the Association for Computational Linguistics