Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Extracting Customer Sentiments from Email Support Tickets: A case for email support ticket prioritisation
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science.
2019 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

Background

Daily, companies generate enormous amounts of customer support tickets which are grouped and placed in specialised queues, based on some characteristics, from where they are resolved by the customer support personnel (CSP) on a first-in-first-out basis. Given that these tickets require different levels of urgency, a logical next step to improving the effectiveness of the CSPs is to prioritise the tickets based on business policies. Among the several heuristics that can be used in prioritising tickets is sentiment polarity.

Objectives

This study investigates how machine learning methods and natural language techniques can be leveraged to automatically predict the sentiment polarity of customer support tickets using.

Methods

Using a formal experiment, the study examines how well Support Vector Machine (SVM), Naive Bayes (NB) and Logistic Regression (LR) based sentiment polarity prediction models built for the product and movie reviews, can be used to make sentiment predictions on email support tickets. Due to the limited size of annotated email support tickets, Valence Aware Dictionary and sEntiment Reasoner (VADER) and cluster ensemble - using k-means, affinity propagation and spectral clustering, is investigated for making sentiment polarity prediction.

Results

Compared to NB and LR, SVM performs better, scoring an average f1-score of .71 whereas NB scores least with a .62 f1-score. SVM, combined with the presence vector, outperformed the frequency and TF-IDF vectors with an f1-score of .73 while NB records an f1-score of .63. Given an average f1-score of .23, the models transferred from the movie and product reviews performed inadequately even when compared with a dummy classifier with an f1-score average of .55. Finally, the cluster ensemble method outperformed VADER with an f1-score of .61 and .53 respectively.

Conclusions

Given the results, SVM, combined with a presence vector of bigrams and trigrams is a candidate solution for extracting sentiments from email support tickets. Additionally, transferring sentiment models from the movie and product reviews domain to the email support tickets is not possible. Finally, given that there exists a limited dataset for conducting sentiment analysis studies in the Swedish and the customer support context, a cluster ensemble is recommended as a sample selection method for generating annotated data.

Place, publisher, year, edition, pages
2019. , p. 62
Keywords [en]
Machine Learning, Natural Language Processing, Sentiment Analysis, Cluster Ensemble, VADER, Customer support
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:bth-18853OAI: oai:DiVA.org:bth-18853DiVA, id: diva2:1375069
External cooperation
Telenor AB, Swden, Karlskrona
Subject / course
DV2572 Master´s Thesis in Computer Science
Educational program
DVACS Master of Science Programme in Computer Science
Presentation
2019-09-26, 07:01 (English)
Supervisors
Examiners
Available from: 2019-12-04 Created: 2019-12-04 Last updated: 2019-12-04Bibliographically approved

Open Access in DiVA

Extracting Customer Sentiments(2083 kB)18 downloads
File information
File name FULLTEXT02.pdfFile size 2083 kBChecksum SHA-512
70148ee9dd5584da5f2cd583e95c53c65a6b2c459ff202753f93dba6f9782e39b4bc4f1e60492a51508526e502a79fdfca3e524254377c5fb08bb4b295699fb6
Type fulltextMimetype application/pdf

By organisation
Department of Computer Science
Computer Systems

Search outside of DiVA

GoogleGoogle Scholar
Total: 18 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 75 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf