SWEGRAM: A Web-Based Tool for Automatic Annotation and Analysis of Swedish Texts
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. (språkteknologi) ORCID iD: 0000-0002-4838-6518
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Scandinavian Languages.
2017 (English). In: Proceedings of the 21st Nordic Conference on Computational Linguistics, Nodalida 2017, Göteborg, 2017, p. 132-141
Conference paper, Published paper (Refereed)
Abstract [en]

We present SWEGRAM, a web-based tool for the automatic linguistic annotation and quantitative analysis of Swedish text, enabling researchers in the humanities and social sciences to annotate their own text and produce statistics on linguistic and other text-related features on the basis of this annotation. The tool allows users to upload one or several documents, which are automatically fed into a pipeline of tools for tokenization and sentence segmentation, spell checking, part-of-speech tagging and morpho-syntactic analysis as well as dependency parsing for syntactic annotation of sentences. The analyzer provides statistics on the number of tokens, words and sentences, the number of parts of speech (PoS), readability measures, the average length of various units, and frequency lists of tokens, lemmas, PoS, and spelling errors. SWEGRAM allows users to create their own corpus or compare texts on various linguistic levels.

Place, publisher, year, edition, pages
Göteborg, 2017. p. 132-141
Series
Linköping Electronic Conference Proceedings, ISSN 1650-3686, E-ISSN 1650-3740 ; 131
Keyword [en]
NLP, automatic linguistic annotation, quantitative text analysis
National Category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-337519
ISBN: 978-91-7685-601-7 (electronic)
OAI: oai:DiVA.org:uu-337519
DiVA, id: diva2:1169892
Conference
21st Nordic Conference on Computational Linguistics, Nodalida 2017
Projects
SWE-CLARIN
Available from: 2017-12-30 Created: 2017-12-30 Last updated: 2018-01-13 Bibliographically approved

Open Access in DiVA

fulltext (273 kB), 33 downloads
File information
File name: FULLTEXT01.pdf
File size: 273 kB
Checksum (SHA-512): 343a7a3508d4066fb7feee0dcf9d323d2128a597f5897be9a41a06406e9b5e43503035f9f60b9445282df052bfc8db6fb2b8b1c6ae06bd493c1243e0569a29bf
Type: fulltext
Mimetype: application/pdf

Other links

http://www.ep.liu.se/ecp/contents.asp?issue=131

Authors
Näsman, Jesper; Megyesi, Beáta; Palmér, Anne