Digitala Vetenskapliga Arkivet

Unveiling Online Extremism: Leveraging RoBERTa for Detecting Digital Violent Right-Wing Extremism
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2024 (English)
Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

Violent right-wing extremism (VRWE) poses a significant challenge, particularly in the digital environment, which has become a breeding ground for the propagation of violent right-wing ideologies. Because the internet facilitates the spread of VRWE, increased attention has been given to identifying violent extremists online before they engage in offline violence. Given the sheer volume of online data and the evolving nature of language, there is a pressing need for automated measures to detect and identify online VRWE content. Machine learning (ML) based tools have proven effective in detecting violent threats on the internet. Nevertheless, hardly any dataset or ML model focuses exclusively on threats spanning the entire spectrum of VRWE ideologies. This research aims to train and evaluate an ML model based on RoBERTa to identify linguistic patterns associated with VRWE in online environments. The research employs the design science research methodology to achieve this goal.

To fine-tune the RoBERTa model, a dataset of 3000 posts from the far-right extremist forums Iron March and Stormfront, as well as from Twitter, was created. The dataset was cleaned and annotated, and approximately 45% of its posts were labelled as VRWE. The fine-tuned RoBERTa model was evaluated on unseen data from social media. The model performed relatively well, reaching 87% accuracy, although classifying VRWE content remains difficult because of its subtle nature. Most actual VRWE posts were correctly identified, and few non-VRWE posts were misclassified as VRWE. Online social platforms can use the model as an initial filter, followed by manual review, to make VRWE detection more reliable. Future research can improve the model's performance and reliability by expanding and updating the dataset with the latest VRWE language.
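The evaluation described above reports accuracy and comments on how many VRWE posts were found and how many non-VRWE posts were wrongly flagged. A minimal sketch of how such figures relate to one another is given below. It is not the thesis's evaluation code: the function name `evaluate`, the `BinaryMetrics` container, and the toy labels are all assumptions made for illustration, and the numbers are unrelated to the reported results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    """Hypothetical container for the three metrics discussed above."""
    accuracy: float
    precision: float
    recall: float


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute metrics for binary labels, assuming 1 = VRWE and 0 = non-VRWE."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # VRWE correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-VRWE correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-VRWE wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # VRWE missed

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, recall)


# Toy labels for illustration only (not data from the thesis).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In this framing, "most actual VRWE posts were correctly identified" corresponds to high recall, and "few non-VRWE posts were misclassified as VRWE" corresponds to high precision. When the model is used as an initial filter ahead of manual review, recall is usually the more important of the two, since missed posts never reach a reviewer.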

Place, publisher, year, edition, pages
2024.
Keywords [en]
violent right-wing extremism, machine learning, RoBERTa, text classification
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:su:diva-242841
OAI: oai:DiVA.org:su-242841
DiVA, id: diva2:1955774
Available from: 2025-04-30 Created: 2025-04-30

Author
Kour, Jawdat
