SWORD: Towards Cutting-Edge Swedish Word Processing
2016 (English). In: Proceedings of SLTC 2016, 2016. Conference paper, Published paper (Refereed)
Abstract [en]

Despite many years of research on Swedish language technology, there is still no well-documented standard for Swedish word processing covering the whole spectrum from low-level tokenization to morphological analysis and disambiguation. SWORD is a new initiative within the SWE-CLARIN consortium aiming to develop documented standards for Swedish word processing. In this paper, we report on a pilot study of Swedish tokenization, where we compare the output of six different tokenizers on four different text types. For one text type (Wikipedia articles), we also compare to the tokenization produced by six manual annotators.

Place, publisher, year, edition, pages
2016.
Keywords [en]
Tokenization, morphological analysis
National Category
General Language Studies and Linguistics; Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:su:diva-137054
OAI: oai:DiVA.org:su-137054
DiVA, id: diva2:1058883
Conference
SLTC 2016 - The Sixth Swedish Language Technology Conference (SLTC), Umeå, Sweden, 17-18 November, 2016
Projects
SWE-CLARIN
Funder
Swedish Research Council, 821-2013-2003
Available from: 2016-12-21 Created: 2016-12-21 Last updated: 2018-04-18 Bibliographically approved

Open Access in DiVA

fulltext (114 kB), 57 downloads
File information
File name: FULLTEXT01.pdf
File size: 114 kB
Checksum SHA-512:
28a70446eabd18e2afad1a9ff7d1d016e3c796bc421d08e99e7f52880f57c9c4e66f62c2af68520d2fad65950f644b95f7e3df95ff21313c91952e0b22e696d4
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Adesam, Yvonne; Ahrenberg, Lars; Borin, Lars; Bouma, Gerlof; Kann, Viggo; Östling, Robert; Smith, Aaron; Wirén, Mats; Nivre, Joakim
By organisation
Computational Linguistics
General Language Studies and Linguistics; Language Technology (Computational Linguistics)
