APPLICATIONS OF DEEP LEARNING IN TEXT CLASSIFICATION FOR HIGHLY MULTICLASS DATA
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Statistics.
2019 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Text classification using deep learning is rarely applied to tasks with more than ten target classes. This thesis investigates whether deep learning can be successfully applied to a task with over 1000 target classes. A pretrained Long Short-Term Memory language model is fine-tuned and used as the base for the classifier. After five days of training, the deep learning model achieves 80.5% accuracy on a publicly available dataset, 9.3% higher than Naive Bayes. When allowed five guesses, the model predicts the correct class 92.2% of the time.
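The "five guesses" figure reads as a top-5 accuracy: a prediction counts as correct if the true class is among the five highest-scoring classes. The following is a minimal sketch of that metric, assuming the classifier outputs one score per class; the function name `top_k_accuracy` and the toy data are hypothetical and do not come from the thesis.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the k best.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy data (hypothetical): 3 examples, 3 classes.
scores = [
    [0.1, 0.6, 0.3],
    [0.5, 0.2, 0.3],
    [0.2, 0.3, 0.5],
]
labels = [1, 2, 1]

top1 = top_k_accuracy(scores, labels, k=1)  # ordinary accuracy
top2 = top_k_accuracy(scores, labels, k=2)  # correct if the label is in the best two
```

With `k=1` this reduces to ordinary accuracy, so the two figures in the abstract would correspond to `k=1` and `k=5` under this reading.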

Place, publisher, year, edition, pages
2019.
Keywords [en]
ULMFiT, Neural Networks, NLP, LSTM, Transfer Learning
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:uu:diva-385162
OAI: oai:DiVA.org:uu-385162
DiVA, id: diva2:1323153
Subject / course
Statistics
Educational program
Master Programme in Statistics
Available from: 2019-06-18
Created: 2019-06-11
Last updated: 2019-06-18
Bibliographically approved

Open Access in DiVA

fulltext (2057 kB), 122 downloads
File information
File name: FULLTEXT01.pdf
File size: 2057 kB
Checksum (SHA-512): 4a1b97de536c6f59cb0e8ecbdab9c9b62990dd6425f812887d38ecabd562051fc79f1e6ae2d8d6188e5f719ed2a3c9e73c5c4757f171249d0fdef14546054377
Type: fulltext
Mimetype: application/pdf


Total: 122 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 101 hits