Tagging a Morphologically Complex Language Using an Averaged Perceptron Tagger: The Case of Icelandic
2013 (English). In: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), Linköping University Electronic Press, Linköpings universitet, 2013, pp. 105-119. Conference paper (Refereed)
In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Perceptron tagger, to tag Icelandic, a morphologically complex language. By adding language-specific linguistic features and using IceMorphy, an unknown word guesser, we obtain state-of-the-art tagging accuracy of 92.82%. Furthermore, by adding data from a morphological database, and word embeddings induced from an unannotated corpus, the accuracy increases to 93.84%. This is equivalent to an error reduction of 5.5%, compared to the previously best tagger for Icelandic, consisting of linguistic rules and a Hidden Markov Model.
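The accuracy and error-reduction figures in the abstract are related by simple arithmetic, which the following minimal Python sketch illustrates. The function names are hypothetical and chosen only for this illustration. The accuracy of the previously best tagger is not stated in the abstract, so the baseline computed at the end is merely the value implied by the reported 5.5% reduction, not a reported result.

```python
def relative_error_reduction(old_accuracy: float, new_accuracy: float) -> float:
    """Fraction of the old error rate removed by the new system.

    Both accuracies are percentages in the range 0-100.
    """
    old_error = 100.0 - old_accuracy
    new_error = 100.0 - new_accuracy
    return (old_error - new_error) / old_error


def implied_baseline_accuracy(new_accuracy: float, reduction: float) -> float:
    """Baseline accuracy implied by a given relative error reduction.

    Inverts relative_error_reduction: old_error = new_error / (1 - reduction).
    """
    new_error = 100.0 - new_accuracy
    return 100.0 - new_error / (1.0 - reduction)


# Accuracies stated in the abstract (percent).
with_features = 92.82      # language-specific features + IceMorphy
with_extra_data = 93.84    # plus morphological database and word embeddings

# Error reduction between the two configurations described in the abstract.
internal_gain = relative_error_reduction(with_features, with_extra_data)
print(f"Reduction from 92.82% to 93.84%: {internal_gain:.1%}")

# Assumption: this baseline is derived from the stated 5.5% figure,
# not quoted from the paper.
baseline = implied_baseline_accuracy(with_extra_data, 0.055)
print(f"Implied accuracy of the previous best tagger: {baseline:.2f}%")
```

Note that the 5.5% figure is measured against the earlier rule-based and Hidden Markov Model tagger, not against the 92.82% configuration, which is why the two reductions differ.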
Place, publisher, year, edition, pages: Linköping University Electronic Press, Linköpings universitet, 2013, pp. 105-119
Series: Linköping Electronic Conference Proceedings, ISSN 1650-3740
Keywords: part-of-speech tagging, POS tagging, Icelandic
Language Technology (Computational Linguistics)
Research subject: Computational Linguistics
Identifiers
URN: urn:nbn:se:su:diva-90304
OAI: oai:DiVA.org:su-90304
DiVA: diva2:624559
Conference: 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)