Digitala Vetenskapliga Arkivet

Application of a topic model visualisation tool to a second language
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM); The Institute for Language and Folklore, Sweden; Hokkaido University, Japan. ORCID iD: 0000-0001-6164-7762
The Institute for Language and Folklore, Sweden.
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (ISOVIS; DISA-DH) ORCID iD: 0000-0002-0519-2537
Hokkaido University, Japan;RIKEN Center for Advanced Intelligence Project (AIP), Japan.
2019 (English). In: CLARIN 2019 Book of abstracts, CLARIN, Common Language Resources and Technology Infrastructure, 2019. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

We explored the adaptations required for applying a topic modelling tool to a language that is very different from the one for which the tool was originally developed. The tool, which enables text analysis on the output of topic modelling, was developed for English, and we here applied it to Japanese texts. As white space is not used to indicate word boundaries in Japanese, the texts had to be pre-tokenised, with white space inserted to mark the token segmentation, before they could be imported into the tool. The tool was also extended with word translations and phonetic readings to support users who are second-language speakers of Japanese.
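The pre-tokenisation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the tokeniser that was used, so a greedy longest-match lookup against a tiny hand-made lexicon stands in for a real Japanese morphological analyser. The names `segment`, `insert_whitespace`, and `TOY_LEXICON` are hypothetical and exist only for this illustration.

```python
def segment(text, lexicon):
    """Greedy longest-match segmentation.

    Assumption: a toy stand-in for a real Japanese tokeniser.
    Characters not covered by the lexicon become single-character tokens.
    """
    max_len = max(map(len, lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def insert_whitespace(text, lexicon):
    """Return the text with a space between tokens, so that a tool
    which splits on white space can import it."""
    return " ".join(segment(text, lexicon))


# Hypothetical mini-lexicon, for illustration only.
TOY_LEXICON = {"私", "は", "日本語", "を", "勉強", "します"}

print(insert_whitespace("私は日本語を勉強します", TOY_LEXICON))
```

In practice the lexicon lookup would be replaced by the output of a dedicated tokeniser; only the final step, joining tokens with white space before import, corresponds directly to what the abstract describes.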

Place, publisher, year, edition, pages
CLARIN, Common Language Resources and Technology Infrastructure, 2019.
Keywords [en]
Topic Models, Visualization, Japanese, Text Mining, Visual Text Analysis
National Category
Natural Language Processing; Human Computer Interaction
Research subject
Computer Science, Information and software visualization
Identifiers
URN: urn:nbn:se:lnu:diva-87108; OAI: oai:DiVA.org:lnu-87108; DiVA, id: diva2:1340988
Conference
CLARIN Annual Conference 2019, 30 September - 2 October 2019, Leipzig, Germany
Projects
DISA-DH
Funder
Swedish Research Council, 2017-00626
Available from: 2019-08-07 Created: 2019-08-07 Last updated: 2025-02-01 Bibliographically approved

Open Access in DiVA

fulltext (682 kB), 157 downloads
File information
File name: FULLTEXT01.pdf; File size: 682 kB
Checksum (SHA-512): a112a6e7e7972fd9be0494754dd33e1012ce45303ce9712249699726a0e9f0ea1ac7d931bb429fa9252b4c069519352f7d1eeb7fb01aa6586205b21455e3cf53
Type: fulltext; Mimetype: application/pdf

Search in DiVA

By author/editor
Skeppstedt, Maria; Kerren, Andreas
By organisation
Department of computer science and media technology (CM)
Natural Language Processing; Human Computer Interaction
