Stock trend prediction using news articles: a text mining approach
2007 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

Stock market prediction with data mining techniques is an important research problem. Mining textual documents and time series concurrently, for example predicting the movements of stock prices from the contents of news articles, is an emerging topic in the data mining and text mining communities. Previous research has shown that there is a strong relationship between the time when news stories are released and the time when stock prices fluctuate. In this thesis, we present a model that predicts changes in the stock trend by analyzing the influence of non-quantifiable information, namely news articles, which are rich in information and superior to numeric data. In particular, we investigate the immediate impact of news articles on the time series, based on the Efficient Markets Hypothesis. This is a binary classification problem that uses several data mining and text mining techniques. To build the prediction model, we use the intraday prices and the time-stamped news articles related to Iran-Khodro Company for the consecutive years 1383 and 1384. A new statistics-based piecewise segmentation algorithm is proposed to identify trends in the time series. The news articles are preprocessed and labeled as either rise or drop by aligning them back to the segmented trends. A document selection heuristic based on chi-square estimation is used to select the positive training documents. The selected news articles are represented using the vector space model and the tf-idf term weighting scheme. Finally, the relationship between the contents of the news stories and the trends in the stock prices is learned with a support vector machine. Different experiments are conducted to evaluate various aspects of the proposed model, and encouraging results are obtained in all of them.
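The labeling and term-weighting steps described above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the alignment rule (a news item takes the label of the segment whose time span contains its release time), the tf-idf variant (relative term frequency times the natural log of inverse document frequency), the function names, and the toy data are all assumptions.

```python
import math
from collections import Counter


def label_by_trend(news_time, segments):
    """Label a news item 'rise' or 'drop' from the trend segment it falls in.

    segments: list of (start, end, trend) tuples, an assumed format.
    Returns None when no segment covers the release time.
    """
    for start, end, trend in segments:
        if start <= news_time < end:
            return trend
    return None


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


# Hypothetical toy data: trend segments on an integer time axis and tokenized news.
segments = [(0, 10, "rise"), (10, 25, "drop")]
docs = [
    ["profit", "rises", "sharply"],
    ["profit", "falls"],
    ["production", "falls", "sharply"],
]
labels = [label_by_trend(t, segments) for t in (3, 12, 30)]
vectors = tfidf_vectors(docs)
print(labels)
print(vectors[0])
```

Terms that occur in fewer documents receive larger weights, which is why "rises" outweighs "profit" in the first toy document. The resulting labeled vectors would then be passed to a classifier such as a support vector machine.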
The prediction model achieves an accuracy of 83%, compared with 51% for random labeling of the news, an improvement of about 30 percentage points. In other words, the model predicts about 1.6 times more accurately than random news labeling.
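As a quick check on how the two reported accuracies relate, the arithmetic can be worked through as below. The variable names are illustrative; only the 83% and 51% figures come from the abstract.

```python
# Accuracies reported in the abstract.
model_accuracy = 0.83
random_accuracy = 0.51

# Absolute gain in percentage points and relative improvement factor.
point_gain = (model_accuracy - random_accuracy) * 100
ratio = model_accuracy / random_accuracy

print(f"gain: {point_gain:.0f} percentage points")
print(f"ratio: {ratio:.2f}x")
```

The difference is an absolute gain in percentage points, while the ratio is the relative factor that the abstract rounds to 1.6.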

Place, publisher, year, edition, pages
2007.
Keyword [en]
Social Behaviour Law, Text Mining, Stock Trend Prediction, Support Vector Machines
Keyword [sv]
Samhälls-, beteendevetenskap, juridik
Identifiers
URN: urn:nbn:se:ltu:diva-46064
ISRN: LTU-PB-EX--07/071--SE
Local ID: 3b47481b-2458-4498-9524-7872a2b3ab8a
OAI: oai:DiVA.org:ltu-46064
DiVA: diva2:1019373
Subject / course
Student thesis, at least 30 credits
Educational program
Electronic Commerce, master's level
Examiners
Note
Validated; 20101217 (root)
Available from: 2016-10-04
Created: 2016-10-04
Bibliographically approved

Open Access in DiVA

fulltext (1936 kB), 1453 downloads
File information
File name: FULLTEXT01.pdf
File size: 1936 kB
Checksum (SHA-512): 58418e4431d773642d9b0d9c8bc3b8ce30abb802cfddfdad30bb85ed3fa21a47133c1ef69aaec238af6724064c1356a15eec8d0c1c1e67e96d46d9aaea679a71
Type: fulltext
Mimetype: application/pdf
