Stock trend prediction using news articles: a text mining approach
2007 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Predicting stock market behaviour with data mining techniques is an important research problem. Mining textual documents and time series concurrently, for example predicting stock price movements from the contents of news articles, is an emerging topic in the data mining and text mining communities. Previous research has shown a strong relationship between the time at which news stories are released and the time at which stock prices fluctuate. In this thesis, we present a model that predicts changes in stock trend by analyzing the influence of non-quantifiable information, namely news articles, which are rich in information and superior to numeric data. In particular, we investigate the immediate impact of news articles on the price time series, based on the Efficient Markets Hypothesis. We treat this as a binary classification problem and apply several data mining and text mining techniques. To build the prediction model, we use intraday prices and time-stamped news articles related to Iran-Khodro Company for the consecutive years 1383 and 1384. A new statistics-based piecewise segmentation algorithm is proposed to identify trends in the time series. The news articles are preprocessed and labeled as either rise or drop by aligning them back to the segmented trends. A document selection heuristic based on chi-square estimation is used to select the positive training documents. The selected news articles are represented using the vector space model with the tf-idf term weighting scheme. Finally, the relationship between the contents of the news stories and the trends in stock prices is learned with a support vector machine. Several experiments are conducted to evaluate different aspects of the proposed model, and encouraging results are obtained in all of them.
The prediction model achieves an accuracy of 83%, compared with 51% for random labeling of the news, an improvement of about 30 percentage points. In other words, the model predicts correctly about 1.6 times as often as random news labeling.
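The vector space representation with tf-idf weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which tf-idf variant the thesis uses, so the formula here (raw term frequency times log(N / df)) is an assumption, and the function name `tfidf_vectors` and the toy headlines are hypothetical. The resulting vectors are the kind of input a support vector machine would be trained on; the classifier itself is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one tf-idf vector per tokenized document.

    Assumed weighting: raw term frequency * log(N / document frequency).
    The thesis may use a different variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # A term that occurs in every document gets weight 0 (log(1) == 0).
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


# Hypothetical toy headlines, already tokenized.
docs = [
    ["profit", "rises", "sharply"],
    ["profit", "falls"],
    ["exports", "rise", "profit"],
]
vocab, vectors = tfidf_vectors(docs)
```

In this toy corpus the term "profit" occurs in every headline, so its weight is zero in all three vectors, while a term unique to one headline receives the largest weight. This is the property that lets tf-idf emphasize the words that distinguish rise articles from drop articles.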

Keyword [en]
Social Behaviour Law, Text Mining, Stock Trend Prediction, Support Vector Machines
Keyword [sv]
Samhälls-, beteendevetenskap, juridik (Social science, behavioural science, law)
URN: urn:nbn:se:ltu:diva-46064
ISRN: LTU-PB-EX--07/071--SE
Local ID: 3b47481b-2458-4498-9524-7872a2b3ab8a
OAI: diva2:1019373
Subject / course
Student thesis, at least 30 credits
Educational program
Electronic Commerce, master's level
Validated; 20101217 (root)
Available from: 2016-10-04
Created: 2016-10-04
Bibliographically approved

Open Access in DiVA

fulltext (1936 kB), 8 downloads
File information
File name: FULLTEXT01.pdf
File size: 1936 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf
