Sarcasm Detection with TensorFlow
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS).
2018 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Upptäcka sarkasm med TensorFlow (Swedish)
Abstract [en]

Sentiment analysis is the process of having a computer infer a person's sentiment towards something from a text. Among other things, this is useful in marketing: if the computer works out that a certain person likes a certain product, it can show that person ads for similar products. Sentiment analysis in social media applies this to texts from a social media context, such as comments or posts on Twitter, Facebook, etc. One problematic aspect of these texts is sarcasm. People are very often sarcastic in social media, and because sarcasm can be hard to detect even for a human, it causes problems for the computer. This study was conducted to investigate how sarcasm detection can be performed on social media texts with the help of machine learning. For this purpose, TensorFlow, Google's machine learning framework for Python, was used. The machine learning model created was a deep neural network with two hidden layers of ten nodes each. As input, a dataset of 4692 texts was used with an 80/20 training/testing split. To preprocess the texts into a form more suitable for TensorFlow, the methods Bag of Words, Bigrams and a naive method here referred to as Char for Char were considered. However, due to time constraints, proper results from the more advanced approaches (Bigrams and Bag of Words) were not achieved. It was at least found that the rather simple approach performed better than expected, with results notably better than 50% that would be highly unlikely to arise through sheer luck.
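The three preprocessing methods named in the abstract can be sketched in plain Python. This is a minimal illustration, not the thesis code: the function names, the whitespace tokenizer, and the 80/20 split helper are hypothetical, and the reading of Char for Char as one integer code point per character, zero-padded to a fixed length, is an assumption, since the abstract does not define the method.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocabulary(texts):
    """Map every distinct token in the corpus to a column index."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Count vector with one slot per vocabulary word; word order is lost."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


def bigrams(text):
    """Adjacent word pairs, which keep some local word order."""
    tokens = tokenize(text)
    return list(zip(tokens, tokens[1:]))


def char_for_char(text, length):
    """Assumed reading of 'Char for Char': one code point per character,
    truncated or zero-padded to a fixed length."""
    codes = [ord(ch) for ch in text[:length]]
    return codes + [0] * (length - len(codes))


def train_test_split(items, train_fraction=0.8):
    """Split a list into a leading training part and a trailing test part."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]
```

Each function turns a text into a fixed or simple numeric form that a network with two small hidden layers could take as input. Bag of Words and Bigrams need a vocabulary built from the whole corpus, whereas the character encoding needs none, which may be one reason a naive method of this kind is quicker to get working.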

Abstract [sv] (translated from Swedish)

Sentiment analysis is when a computer is given the task of guessing what someone thinks about something based on a text. This can be useful in marketing, among other things; for example, when a computer has worked out that a person likes a product, it can show that person ads for similar products. Sentiment analysis in social media is when the texts analyzed come from social media, such as posts and comments from Facebook, Twitter, etc. One problematic aspect of these texts is sarcasm. People tend to be sarcastic often in social media, while sarcasm can be hard to detect even for a human reading the text. This study was conducted with the aim of investigating how sarcasm detection can be performed on texts from social media with the help of machine learning. For that purpose, Google's machine learning framework for Python, TensorFlow, was used. The machine learning model created with the framework was a deep neural network with two hidden layers of ten nodes each. As input, a dataset of 4692 texts was used with an 80/20 training/testing split. To convert the texts into a form compatible with TensorFlow, the methods Bag of Words, Bigrams, and a naive method here called Char for Char were considered. Unfortunately, lack of time meant that proper results from the more advanced methods Bag of Words and Bigrams were not achieved. The naive method, however, led to results that differ markedly from 50% and that would be extremely unlikely to arise through pure luck.
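The claim that results above 50% are unlikely to come from luck can be illustrated with a binomial tail probability. The sketch below is an assumption-laden example, not a result from the thesis: it assumes a two-class task where a guesser is right half the time, a test set of 939 texts (4692 minus an 80% training share), and a purely hypothetical count of correct answers. The function name `chance_probability` is also hypothetical.

```python
from math import comb


def chance_probability(correct, total):
    """Probability that a fair-coin guesser gets at least `correct`
    answers right out of `total` independent two-class decisions."""
    favourable = sum(comb(total, k) for k in range(correct, total + 1))
    return favourable / 2 ** total


# Hypothetical numbers for illustration only, not figures from the thesis.
test_size = 939             # assumed: 4692 texts minus 3753 training texts
hypothetical_correct = 563  # assumed: roughly 60% accuracy
p = chance_probability(hypothetical_correct, test_size)
print(f"P(>= {hypothetical_correct}/{test_size} by guessing) = {p:.3g}")
```

The 50% baseline only holds if the two classes are about equally common in the test set; with an unbalanced dataset the majority-class rate would be the fairer comparison.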

Place, publisher, year, edition, pages
2018.
Series
TRITA-EECS-EX ; 2018:195
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-229768
OAI: oai:DiVA.org:kth-229768
DiVA, id: diva2:1214412
Subject / course
Computer Science
Supervisors
Examiners
Available from: 2018-06-20
Created: 2018-06-06
Last updated: 2018-06-20
Bibliographically approved

Open Access in DiVA

fulltext (744 kB), 71 downloads
File information
File name: FULLTEXT02.pdf
File size: 744 kB
Checksum (SHA-512): fe858c2ad4c4e4081ee30e22c5ec0d0cfd84bb139b44ec0b6caac0a6d90651532c1839fe6b1154142fbf4373b581ec994ef74171785b9cf62ac66ea925046a4e
Type: fulltext
Mimetype: application/pdf
