Digitala Vetenskapliga Arkivet

Visually Guided Extraction of Prevalent Topics
Daniel Witschard: Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (ISOVIS, DISA) ORCID iD: 0000-0001-6150-0787
Ilir Jusufi: Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). Blekinge Institute of Technology, Sweden. ORCID iD: 0000-0001-6745-4398
Kostiantyn Kucher: Linköping University, Sweden. ORCID iD: 0000-0002-1907-7820
Andreas Kerren: Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). Linköping University, Sweden. (ISOVIS, DISA) ORCID iD: 0000-0002-0519-2537
2025 (English) In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 24, no 2, p. 179-198. Article in journal (Refereed) Published
Abstract [en]

Making sense of large sets of text documents is highly challenging for tasks such as obtaining a comprehensive overview or keeping up with the most important trends and topics. Even though several established methods for condensation and summarization of large text corpora exist, many of them lack the ability to account for differences in prevalence between identified topics, which in turn impedes quantitative analysis. In this paper, we therefore propose a novel prevalence-aware method for topic extraction, and show how it can be used to obtain important insights from two text corpora with very different content. We also implemented a prototype visual analytics tool which guides the user in the search for relevant insights and promotes trust in the yielded results. We have verified our application through a user study, as well as through a validation run on a data set with a previously known topic structure. The results clearly show that our approach is suitable for text mining, that it can be used by non-experts, and that it offers features which make it an interesting candidate for use in several different analysis scenarios.

Place, publisher, year, edition, pages
SAGE Publications, 2025. Vol. 24, no 2, p. 179-198
Keywords [en]
Visual Analytics, Text Mining, Text Embedding, Topic Modelling, Similarity Calculations
National Category
Computer Sciences; Human Computer Interaction
Research subject
Computer Science, Information and software visualization
Identifiers
URN: urn:nbn:se:lnu:diva-136101; DOI: 10.1177/14738716241312400; ISI: 001408697200001; Scopus ID: 2-s2.0-85216198128; OAI: oai:DiVA.org:lnu-136101; DiVA, id: diva2:1935944
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Note

This work was partially supported through the ELLIIT environment for strategic research in Sweden. The work of Ilir Jusufi was supported in part by the Knowledge Foundation, Sweden, through the project ”Rekryteringar 21, Universitetslektor i spelteknik” under Contract 20210077.

Available from: 2025-02-09. Created: 2025-02-09. Last updated: 2025-04-14. Bibliographically approved

Open Access in DiVA

fulltext (4340 kB), 4 downloads
File information
File name: FULLTEXT01.pdf; File size: 4340 kB; Checksum SHA-512:
134c8472bbad07ec716ab10ef6b3b5a33f4dff8d8008d9ed73f974acd5b5bc50bb561c67494a4cb3af9d3508120fdda49bafbb11cf76d1354625aa3302f3216b
Type: fulltext; Mimetype: application/pdf

Other links

Publisher's full text; Scopus

Search in DiVA

By author/editor
Witschard, Daniel; Jusufi, Ilir; Kucher, Kostiantyn; Kerren, Andreas
By organisation
Department of computer science and media technology (CM)
In the same journal
Information Visualization
Computer Sciences; Human Computer Interaction
