Detecting Trends on Twitter: The Effect of Unsupervised Pre-Training
KTH, School of Computer Science and Communication (CSC).
KTH, School of Engineering Sciences (SCI).
2016 (English)
Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Hitta Twittertrender : Effekten av oövervakad förträning (Swedish)
Abstract [en]

Unsupervised pre-training has recently emerged as a method for initializing supervised machine learning methods. It has been applied foremost to artificial neural networks (ANNs). Previous work has found unsupervised pre-training to increase accuracy and to be an effective method of initialization for ANNs[2].

This report studies the effect of unsupervised pre-training when detecting Twitter trends. A Twitter trend is defined as a topic gaining popularity.

Previous work has studied several machine learning methods for analysing Twitter trends. This thesis instead studies the efficiency of a multi-layer perceptron classifier (MLPC) with and without a Bernoulli restricted Boltzmann machine (BRBM) as an unsupervised pre-training method. Two relevant factors studied are the number of hidden layers in the MLPC and the size of the dataset available for training the methods.

This thesis has implemented an MLPC that can detect trends with an accuracy of 85%. However, the experiments conducted to test the effect of unsupervised pre-training were inconclusive: no benefit of BRBM pre-training could be established for the Twitter time series data.
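
As a rough illustration of what unsupervised pre-training with a BRBM involves, the sketch below trains a small Bernoulli RBM with one-step contrastive divergence (CD-1) in plain Python. It is not the implementation used in the thesis: the function names, the toy binarized activity windows, the hyperparameters and the choice of CD-1 are all assumptions made for illustration. In a typical pre-training setup, the learned weights and hidden biases would then serve as the initial values of the first hidden layer of an MLPC before supervised training, in place of a random initialization.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b_hid):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(b + sum(w * x for w, x in zip(row, v)))
            for row, b in zip(W, b_hid)]


def visible_probs(h, W, b_vis):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b_vis[i] + sum(W[j][i] * h[j] for j in range(len(h))))
            for i in range(len(b_vis))]


def train_brbm(data, n_hidden, epochs=50, lr=0.1, seed=0):
    """Train a Bernoulli RBM with CD-1; no labels are used.

    Returns (W, b_hid), where W[j][i] links visible unit i to hidden unit j.
    """
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_visible)]
         for _ in range(n_hidden)]
    b_vis = [0.0] * n_visible
    b_hid = [0.0] * n_hidden
    for _ in range(epochs):
        for v0 in data:
            # Positive phase: hidden probabilities driven by the data.
            h0 = hidden_probs(v0, W, b_hid)
            h0_sample = [1.0 if rng.random() < p else 0.0 for p in h0]
            # Negative phase: one reconstruction step.
            v1 = visible_probs(h0_sample, W, b_vis)
            h1 = hidden_probs(v1, W, b_hid)
            # CD-1 update: data statistics minus reconstruction statistics.
            for j in range(n_hidden):
                for i in range(n_visible):
                    W[j][i] += lr * (h0[j] * v0[i] - h1[j] * v1[i])
                b_hid[j] += lr * (h0[j] - h1[j])
            for i in range(n_visible):
                b_vis[i] += lr * (v0[i] - v1[i])
    return W, b_hid


# Hypothetical data: binarized activity windows for four topics
# (1 = above-average tweet volume in that time slot).
toy_windows = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]

W, b_hid = train_brbm(toy_windows, n_hidden=3)
# Hidden activations that a pre-trained first layer would pass onward.
features = [hidden_probs(v, W, b_hid) for v in toy_windows]
print(len(W), len(W[0]), len(features))
```

The baseline in such a comparison would be the same classifier trained from random initial weights, so that any difference in accuracy can be attributed to the pre-training step.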

Abstract [sv]

Unsupervised pre-training is an area of machine learning used to initialize supervised methods. Previous studies have shown unsupervised pre-training to be an effective method for initializing artificial neural networks (ANNs). This initialization method has had a positive effect on the accuracy of the supervised method[2].

This report studies the effect of unsupervised pre-training when a supervised method is used to find trends in a Twitter dataset. A Twitter trend is defined as a topic's increase in popularity.

Several previous studies have analysed the applicability of various machine learning methods to Twitter time series data. However, no study has focused on the use of unsupervised pre-training and ANNs on this type of data, which this report aims to do. The effect of combining a Bernoulli restricted Boltzmann machine (BRBM) with a multi-layer perceptron classifier (MLPC) is compared with a model that uses only the MLPC. Two relevant factors also studied are how the size of the dataset used to train the methods affects their relative accuracy, and how the number of hidden layers in the MLPC affects each method.

This study has implemented an MLPC that can find trends with 85% accuracy. However, the experiments did not confirm any benefit of unsupervised pre-training when applied to Twitter time series data.

Place, publisher, year, edition, pages
2016. 39 p.
National Category
Engineering and Technology Computer Science
URN: urn:nbn:se:kth:diva-186544
OAI: diva2:927736
Subject / course
Computer Technology and Software Engineering
Educational program
Master of Science in Engineering - Computer Science and Technology
2016-06-17, 05:06 (English)

The presentation has not yet been scheduled, but it will take place in June on the KTH premises.

Available from: 2016-05-18 Created: 2016-05-13 Last updated: 2016-05-18
Bibliographically approved

Open Access in DiVA

fulltext (711 kB), 19 downloads
File information
File name: FULLTEXT01.pdf
File size: 711 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf
