SUC-CORE: A Balanced Corpus Annotated with Noun Phrase Coreference
2013 (English). In: Northern European Journal of Language Technology (NEJLT), ISSN 2000-1533, Vol. 3, no 2, p. 19-39. Article in journal (Refereed). Published.
This paper describes SUC-CORE, a subset of the Stockholm Umeå Corpus and the Swedish Treebank annotated with noun phrase coreference. While most coreference-annotated corpora consist of texts of similar types within related domains, SUC-CORE consists of both informative and imaginative prose and covers a wide range of literary genres and domains. This allows for exploration of coreference across different text types, but it also means that there are limited amounts of data within each type. Future work on coreference resolution for Swedish should include making more annotated data available to the research community.
Language Technology (Computational Linguistics)
Research subject: Computational Linguistics
Identifiers
URN: urn:nbn:se:su:diva-90229
DOI: 10.3384/nejlt.2000-1533.1332
OAI: oai:DiVA.org:su-90229
DiVA: diva2:623882