Visualizing dynamic text corpora using Virtual Reality
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (VRxAR Labs) ORCID iD: 0000-0003-4162-6475
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM). (VRxAR Labs) ORCID iD: 0000-0001-7485-8649
University of Eastern Finland, Finland. ORCID iD: 0000-0003-3123-6932
Linnaeus University, Faculty of Arts and Humanities, Department of Languages.
2018 (English) In: ICAME 39: Tampere, 30 May – 3 June, 2018: Corpus Linguistics and Changing Society: Book of Abstracts, Tampere: University of Tampere, 2018, p. 205-205
Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

In recent years, data visualization has become a major area in Digital Humanities research, and the same holds true in linguistics. The rapidly increasing size of corpora, the emergence of dynamic real-time streams, and the availability of complex and enriched metadata have made it increasingly important to facilitate new and innovative approaches to presenting and exploring primary data. This demonstration showcases the use of Virtual Reality (VR) in the visualization of geospatial linguistic data, drawing on the Nordic Tweet Stream (NTS) project (see Laitinen et al. 2017). The NTS data for this demonstration comprises a full year of geotagged tweets (12,443,696 tweets from 273,648 user accounts) posted within the Nordic region (Denmark, Finland, Iceland, Norway, and Sweden). The dataset includes over 50 metadata parameters in addition to the tweets themselves.

We demonstrate the potential of using VR to efficiently find meaningful patterns in vast streams of data. The VR environment allows an easy overview of any of the features (textual or metadata) in a text corpus. Our focus will be on the language identification data, which provides a previously unexplored perspective into the use of English and other non-indigenous languages in the Nordic countries alongside the native languages of the region.

Our VR prototype uses the HTC Vive headset for a room-scale VR scenario and is being developed with the Unity3D game development engine. Each node in the VR space is displayed as a stacked cuboid, the equivalent of a bar chart in three-dimensional space, summarizing all tweets at one geographic location for a given point in time (see: https://tinyurl.com/nts-vr). Each stacked cuboid represents the three most frequently used languages, color coded, giving the user an overview of the language distribution at each location. The VR prototype further encourages users to move between locations and inspect points of interest in more detail (overall location-related information, a detailed list of all languages detected, the most frequently used hashtags). An underlying map outlines country borders and facilitates orientation. In addition to spatial movement through the Nordic region, the VR system provides an interface for exploring the Twitter data over time, by day, week, month, or predefined special event (see: https://tinyurl.com/nts-vr-time).
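
As an illustration only, the aggregation behind one stacked cuboid could be sketched as follows. This is not the authors' implementation (the prototype is built in Unity3D); the record layout, the function name `stack_segments`, and the sample data are all hypothetical and chosen just to show how tweets at one location and day reduce to the three most frequent languages.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical sample records: (location, day, detected language).
# The real NTS data carries many more metadata fields per tweet.
DAY = date(2017, 1, 1)
OTHER_DAY = date(2017, 1, 2)
tweets = (
    [("Helsinki", DAY, "fi")] * 4
    + [("Helsinki", DAY, "en")] * 3
    + [("Helsinki", DAY, "sv")] * 2
    + [("Helsinki", DAY, "et")]
    + [("Växjö", DAY, "sv")] * 2
    + [("Växjö", OTHER_DAY, "en")]
)


def stack_segments(records, day, top_n=3):
    """Return {location: [(language, count), ...]} for one day.

    Each list holds at most top_n languages, most frequent first,
    which corresponds to the segments of one stacked cuboid.
    """
    per_location = defaultdict(Counter)
    for location, tweet_day, language in records:
        if tweet_day == day:  # keep only the selected point in time
            per_location[location][language] += 1
    return {
        location: counts.most_common(top_n)
        for location, counts in per_location.items()
    }


segments = stack_segments(tweets, DAY)
print(segments["Helsinki"])  # [('fi', 4), ('en', 3), ('sv', 2)]
```

Selecting a different day (or, with a coarser key, a week or month) would yield a new set of segments, which is the kind of time-based exploration the abstract describes.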

In addition to demonstrating how the VR methods aid data visualization and exploration, we will also briefly discuss the pedagogical implications of using VR to showcase linguistic diversity.

Place, publisher, year, edition, pages
Tampere: University of Tampere, 2018. p. 205-205
Keywords [en]
virtual reality, Nordic Tweet Stream, digital humanities
National Category
General Language Studies and Linguistics; Human Computer Interaction; Language Technology (Computational Linguistics)
Research subject
Computer Science, Information and software visualization; Humanities, Linguistics
Identifiers
URN: urn:nbn:se:lnu:diva-75064
OAI: oai:DiVA.org:lnu-75064
DiVA, id: diva2:1213822
Conference
The 39th Annual Conference of the International Computer Archive for Modern and Medieval English (ICAME39): Corpus Linguistics and Changing Society. Tampere, 30 May - 3 June, 2018
Projects
DISA-DH; Open Data Exploration in Virtual Reality (ODxVR)
Available from: 2018-06-05 Created: 2018-06-05 Last updated: 2018-07-23 Bibliographically approved

By author/editor
Alissandrakis, Aris; Reski, Nico; Laitinen, Mikko; Tyrkkö, Jukka; Levin, Magnus; Lundberg, Jonas