Classifying Hate Speech using Fine-tuned Language Models
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Statistics.
2018 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

With the explosive growth of social media, the amount of hate speech is also growing. To combat this efficiently, we need reliable and scalable machine learning models. Current solutions rely either on crowdsourced datasets that are limited in size or on training data from self-identified hateful communities, which lacks specificity. In this thesis we introduce a novel semi-supervised modelling strategy. The model is first trained on the freely available data from the hateful communities and then fine-tuned to classify hateful tweets from crowdsourced annotated datasets. We show that our model reaches state-of-the-art performance with minimal hyper-parameter tuning.

Place, publisher, year, edition, pages
2018. p. 31
Keywords [en]
machine learning, natural language processing, hate speech, transfer learning, semi-supervised learning, recurrent neural networks
National Category
Language Technology (Computational Linguistics); Probability Theory and Statistics; Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:uu:diva-352637
OAI: oai:DiVA.org:uu-352637
DiVA, id: diva2:1214328
Subject / course
Statistics
Educational program
Master Programme in Statistics
Supervisors
Examiners
Available from: 2018-06-19. Created: 2018-06-06. Last updated: 2018-06-19. Bibliographically approved.

Open Access in DiVA

fulltext (604 kB), 213 downloads
File information
File name: FULLTEXT01.pdf
File size: 604 kB
Checksum (SHA-512): b73d34ef6107cd9b98b7f4aad6025914068815ada48e4a716783b160dd2e12cc12252612b30d6e0d93ecab2e359291d03a0049484f74a4a4767cee5ace15df9d
Type: fulltext
Mimetype: application/pdf

Total: 394 hits