Applications of data mining algorithms to analysis of medical data.
Blekinge Institute of Technology, School of Engineering, Department of Systems and Software Engineering.
2007 (English). Independent thesis, Advanced level (degree of Master (One Year)). Student thesis.
Abstract [en]

Medical datasets have grown to enormous sizes. This data may contain valuable information that awaits extraction. The knowledge may be encapsulated in various patterns and regularities hidden in the data, and it may prove priceless in future medical decision making. The data analyzed here comes from the Polish National Breast Cancer Prevention Program, run in Poland in 2006. The first aim of this master's thesis is to evaluate the analytical data from the Program to see whether the domain can be subjected to data mining. The next step is to evaluate several data mining methods with respect to their applicability to the given data, in order to show which techniques are particularly usable for the dataset. Finally, the research aims to extract some tangible medical knowledge from the set. The research uses a data warehouse to store the data. The data is assessed via the ETL process. The performance of the data mining models is measured using lift charts and confusion (classification) matrices. The medical knowledge is extracted based on the indications of the majority of the models. The experiments are conducted in Microsoft SQL Server 2005. The results of the analyses show that the Program did not deliver good-quality data. Many missing values and various discrepancies make it especially difficult to build good models and draw any medical conclusions. It is very hard to decide unequivocally which method is particularly suitable for the given data, so it is advisable to test a set of methods before applying them in real systems. The data mining models were not unanimous about patterns in the data. The medical knowledge is therefore not certain and requires verification by medical professionals. However, most of the models strongly associated the patient's age, tissue type, hormonal therapies and family history of the disease with the malignancy of cancers.
The next step of the research is to present the findings to medical professionals for verification. In the future, the outcomes may provide a good basis for the development of a Medical Decision Support System.

Place, publisher, year, edition, pages
2007. 104 p.
Keyword [en]
medical data mining, medical data warehouse, medical data, breast cancer.
National Category
Computer Science; Software Engineering
URN: urn:nbn:se:bth-4253
Local ID: diva2:831582
Available from: 2015-04-22. Created: 2007-10-16. Last updated: 2015-06-30. Bibliographically approved.
