Text Mining of News Articles for Stock Price Predictions
Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, Department of Computer and Information Science.
2011 (English). Master's thesis (student thesis)
Abstract [en]

This thesis investigates whether stock price changes immediately after the publication of news articles can be predicted by automatic analysis of those articles. Background on financial trading theory and text mining is given, along with an overview of earlier related research on automatic analysis of news articles for predicting future stock prices. A system is designed and implemented to predict stock price trends for the time immediately after the publication of news articles. The system consists mainly of four components. The first component gathers news articles and stock prices automatically from the internet. The second component prepares the news articles by running them through document preprocessing steps and finding relevant features before they are passed to a document representation process. The third component categorizes the news articles into predefined categories, and the fourth component applies appropriate trading strategies depending on the category of the news article. The system requires a labeled data set to train the categorization component. This data set is labeled automatically on the basis of the price trends directly after the news article publication. An additional label refining step using clustering is added in an attempt to improve the labels given by the basic method of labeling by price trends. The findings indicate that categorizing news articles provides additional information that can be used to forecast stock price trends. Experiments showed that the label refining method greatly improves the performance of the system. They also showed that the choice of starting time for the price trends used to label the data sets had a significant impact on the results. Trading simulations performed with the system gained positive returns (profits) on most of their trades. Some of the methods also gave better results than trades performed with the manually labeled data set.

Place, publisher, year, edition, pages
Institutt for datateknikk og informasjonsvitenskap, 2011. 82 p.
Keyword [no]
ntnudaim:6012, MTDT datateknikk (Computer Science), Intelligente systemer (Intelligent Systems)
URN: urn:nbn:no:ntnu:diva-13573
Local ID: ntnudaim:6012
OAI: diva2:440508
Available from: 2011-09-13 Created: 2011-09-13

Open Access in DiVA

fulltext: FULLTEXT01.pdf (1830 kB, application/pdf)
cover: COVER01.pdf (47 kB, application/pdf)
attachment: ATTACHMENT01.zip (683 kB, application/zip)
