SVALA: Annotation of Second-Language Learner Text Based on Mostly Automatic Alignment of Parallel Corpora
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics. ORCID iD: 0000-0003-4040-3544
2019 (English)
In: Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018 / [ed] Inguna Skadina, Maria Eskevich, Linköping: Linköping University Electronic Press, 2019, p. 222-234, article id 023
Conference paper, Published paper (Refereed)
Abstract [en]

Annotation of second-language learner text is a cumbersome manual task which in turn requires interpretation to postulate the intended meaning of the learner’s language. This paper describes SVALA, a tool which separates the logical steps in this process while providing rich visual support for each of them. The first step is to pseudonymize the learner text to fulfil the legal and ethical requirements for a distributable learner corpus. The second step is to correct the text, which is carried out in the simplest possible way by text editing. During the editing, SVALA automatically maintains a parallel corpus with alignments between words in the learner source text and corrected text, while the annotator may repair inconsistent word alignments. Finally, the actual labelling of the corrections (the postulated errors) is performed. We describe the objectives, design and workflow of SVALA, and our plans for further development.
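
The idea of keeping word alignments between the learner source text and the corrected text can be illustrated with a minimal sketch. This is not SVALA's implementation, which this record does not describe; it assumes whitespace tokenization and uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner. The function name `align_tokens` and the example sentences are hypothetical.

```python
from difflib import SequenceMatcher


def align_tokens(source, target):
    """Link source tokens to target tokens (illustrative stand-in aligner).

    Returns a list of (source_indices, target_indices) pairs. Unchanged
    tokens are linked one to one; each changed region becomes a single
    link, which may have an empty side for insertions or deletions.
    """
    links = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Identical tokens: one link per token pair.
            for i, j in zip(range(i1, i2), range(j1, j2)):
                links.append(([i], [j]))
        else:
            # replace / delete / insert: group the whole region in one link.
            links.append((list(range(i1, i2)), list(range(j1, j2))))
    return links


# Hypothetical learner sentence and its corrected version.
source = "I has two cat".split()
target = "I have two cats".split()
links = align_tokens(source, target)

for src_idx, tgt_idx in links:
    print([source[i] for i in src_idx], "->", [target[j] for j in tgt_idx])
```

In a workflow like the one the abstract describes, links with differing sides would be the candidates for error labels, and an annotator could regroup any link the automatic step got wrong.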

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2019. p. 222-234, article id 023
Series
Linköping Electronic Conference Proceedings, ISSN 1650-3686, E-ISSN 1650-3740 ; 159
Keywords [en]
Normalization, Error annotation, Learner corpora, Parallel corpora, Word alignment
National Category
General Language Studies and Linguistics
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:su:diva-170363
ISBN: 978-91-7685-034-3 (print)
OAI: oai:DiVA.org:su-170363
DiVA, id: diva2:1332091
Conference
CLARIN Annual Conference, Pisa, Italy, 8-10 October, 2018
Funder
Riksbankens Jubileumsfond, IN16-0464:1
Available from: 2019-06-27 Created: 2019-06-27 Last updated: 2019-06-28 Bibliographically approved

Open Access in DiVA

fulltext (1276 kB)
File information
File name: FULLTEXT01.pdf
File size: 1276 kB
Checksum (SHA-512): 6cfc7d1f4e7d9ea82103c36d8b61e0ffb2a8ae4d82035854fe38a7ec661d906555e92ecbb8798c1a2af88b05f4a81f448c68f19f95de257645cb76e4c733fb64
Type: fulltext
Mimetype: application/pdf

By author/editor
Wirén, Mats; Volodina, Elena