Online inference of topics: Implementation of the topic model Latent Dirichlet Allocation using an online variational bayes inference algorithm to sort news articles
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2014 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

The client of the project has problems with complex queries and noise when querying their stream of five million news articles per day. This results in much manual work when sorting and pruning the search result of their query. Instead of using direct text matching, the approach of the project was to use a topic model to describe articles in terms of topics covered and to use this new information to sort the articles.

An online version of the topic model Latent Dirichlet Allocation was implemented using online variational Bayes inference to handle streamed data. Using 100 dimensions, topics such as sports and politics emerged during training on a simulated stream of 1.7 million articles. These topics were used to sort articles based on context. The implementation was found to be accurate enough to be useful for the client, as well as fast and stable enough to be a feasible solution to the problem.

Place, publisher, year, edition, pages
2014.
Series
UPTEC F, ISSN 1401-5757 ; 14010
National Category
Computer and Information Science
Identifiers
URN: urn:nbn:se:uu:diva-222429
OAI: oai:DiVA.org:uu-222429
DiVA: diva2:712454
External cooperation
The Loop54 Group AB
Educational program
Master Programme in Engineering Physics
Supervisors
Examiners
Available from: 2014-04-25 Created: 2014-04-10 Last updated: 2014-04-25 Bibliographically approved

Open Access in DiVA

fulltext (1375 kB), 380 downloads
File information
File name: FULLTEXT01.pdf
File size: 1375 kB
Checksum (SHA-512): a8ff9486fe90927cb8194014d82fb3172abead1734d0381af0ab7e96bea1f50bf91f53bc37ad48318a36b69729b79bcbcb02e4c2879f6bc6c0ec01ffabee869e
Type: fulltext
Mimetype: application/pdf
