Viewpoint and Topic Modeling of Current Events
KTH, School of Computer Science and Communication (CSC).
2016 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

There are multiple sides to every story, and while statistical topic models have been highly successful at topically summarizing the stories in corpora of text documents, they do not explicitly address the issue of learning the different sides, the viewpoints, expressed in the documents. In this paper, we show how these viewpoints can be learned in a completely unsupervised manner and represented in a human-interpretable form. For this purpose, we apply CorrLDA2 in a novel way: the model learns topic-viewpoint relations that can be used to form groups of topics, where each group represents a viewpoint. A corpus of documents about the Israeli-Palestinian conflict is then used to demonstrate how a Palestinian and an Israeli viewpoint can be learned. By leveraging the magnitudes and signs of the feature weights of a linear SVM, we introduce a principled method to evaluate associations between topics and viewpoints. With this, we demonstrate, both quantitatively and qualitatively, that the learned topic groups are contextually coherent and that they form consistently correct topic-viewpoint associations.
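The evaluation idea mentioned in the abstract, reading topic-viewpoint associations off the signs and magnitudes of linear SVM weights, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the features are taken to be per-document topic proportions, the labels are the known viewpoint of each document, the toy data are invented, and the function names `train_linear_svm` and `associate_topics` are hypothetical. The small subgradient-descent trainer stands in for whatever SVM implementation the author actually used.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a linear SVM by batch subgradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # hinge loss is active for this document
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def associate_topics(weights, names, pos_label, neg_label, min_magnitude=0.1):
    """Rank topics by |weight|; the sign picks the viewpoint and the
    magnitude indicates how strongly the topic separates the classes."""
    ranked = sorted(zip(names, weights), key=lambda t: abs(t[1]), reverse=True)
    result = []
    for name, weight in ranked:
        if abs(weight) < min_magnitude:
            label = "unclear"  # weight too small to support an association
        elif weight > 0:
            label = pos_label
        else:
            label = neg_label
        result.append((name, label, abs(weight)))
    return result


# Invented toy data: each row holds one document's proportions of three topics.
X = [
    [0.7, 0.1, 0.2], [0.6, 0.2, 0.2], [0.8, 0.0, 0.2],  # viewpoint_A documents
    [0.1, 0.7, 0.2], [0.2, 0.6, 0.2], [0.0, 0.8, 0.2],  # viewpoint_B documents
]
y = [1, 1, 1, -1, -1, -1]
topic_names = ["topic_a", "topic_b", "topic_c"]

weights, bias = train_linear_svm(X, y)
for name, label, strength in associate_topics(
    weights, topic_names, "viewpoint_A", "viewpoint_B"
):
    print(f"{name}: {label} (|w| = {strength:.3f})")
```

In this sketch a positive weight ties a topic to the class labelled +1 and a negative weight to the class labelled -1, while a weight near zero means the topic does not help separate the two viewpoints. The `min_magnitude` threshold is an arbitrary illustrative choice.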

Abstract [sv] (translated to English)

In this bachelor's thesis, we demonstrate how viewpoints expressed in articles about current events can be modeled with an unsupervised learning method. We adapt the CorrLDA2 model for this purpose; it can learn which topics are discussed in a collection of text documents, which viewpoints are expressed, and the relations between topics and viewpoints. Using these relations, we can then form groups of topics, where each group is associated with a viewpoint. This yields a representation of viewpoints that is interpretable by humans. We demonstrate this using a collection of documents about the Israeli-Palestinian conflict, by forming one group of topics that represents the Palestinian viewpoint and one group that represents the Israeli viewpoint. We then introduce a new evaluation method that uses the magnitudes and signs of the feature weights of a linear SVM. With this, we show, both quantitatively and qualitatively, that the learned relations between topics and viewpoints form coherent topic groups and consistently correct associations between topics and viewpoints.

Place, publisher, year, edition, pages
2016.
Keyword [en]
viewpoint topic model
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-190083
OAI: oai:DiVA.org:kth-190083
DiVA: diva2:951119
Note

This is the second time I am submitting my thesis here on DiVA.

I didn't attach the actual thesis document (i.e. the PDF file) last time because we were submitting it for publication at a scientific conference, and I wanted to respect the double-blind review process by not publishing anything beforehand.

Now, I want to publish the thesis document here on DiVA.

Available from: 2016-08-22
Created: 2016-08-05
Last updated: 2016-08-22
Bibliographically approved

Open Access in DiVA

Viewpoint and Topic Modeling of Current Events (659 kB), 4 downloads
File information
File name: FULLTEXT01.pdf
File size: 659 kB
Checksum (SHA-512): 25c0da01004f3ed77078e59f74420c4b4d39b73d888d467449ca1956080e101ec682e41db854603f0ed53de76088ef0b6b8e38bece12b5e82d714a3619450388
Type: fulltext
Mimetype: application/pdf
