Classifying Hate Speech using Fine-tuned Language Models
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Statistics.
2018 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

With the explosive growth of social media, the amount of hate speech is also growing. To combat this issue efficiently, we need reliable and scalable machine learning models. Current solutions rely either on crowdsourced datasets that are limited in size or on training data from self-identified hateful communities, which lacks specificity. In this thesis we introduce a novel semi-supervised modelling strategy. The model is first trained on the freely available data from the hateful communities and then fine-tuned to classify hateful tweets from crowdsourced annotated datasets. We show that our model reaches state-of-the-art performance with minimal hyper-parameter tuning.

Place, publisher, year, edition, pages
2018. p. 31
Keywords [en]
machine learning, natural language processing, hate speech, transfer learning, semi-supervised learning, recurrent neural networks
National Category
Identifiers
URN: urn:nbn:se:uu:diva-352637
OAI: oai:DiVA.org:uu-352637
DiVA, id: diva2:1214328
Subject / course
Statistics
Educational program
Master Programme in Statistics
Supervisors
Examiners
Available from: 2018-06-19. Created: 2018-06-06. Last updated: 2018-06-19. Bibliographically approved

Open Access in DiVA

Full text (604 kB), 518 downloads
File information
File name: FULLTEXT01.pdf
File size: 604 kB
Checksum (SHA-512):
b73d34ef6107cd9b98b7f4aad6025914068815ada48e4a716783b160dd2e12cc12252612b30d6e0d93ecab2e359291d03a0049484f74a4a4767cee5ace15df9d
Type: fulltext
Mimetype: application/pdf
