Prediction of Factors Influencing Rats Tuberculosis Detection Performance Using Data Mining Techniques
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Informatics and Media.
2019 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

This thesis aimed to predict the factors that influence rats' tuberculosis (TB) detection performance using data mining techniques. The dataset of rats' TB detection performance was provided by the APOPO TB training and research center in Morogoro, Tanzania. After preprocessing, the dataset contained 471,133 detection performance observations from a sample of 4 female rats; only 200,000 of these observations were used in the analysis. Following the CRISP-DM methodology, the thesis used the R language as the data mining tool. Classification models were built with decision tree, random forest, and naive Bayes algorithms to predict the influencing factors and classify the rats. The models were validated on the same test data to compare their classification accuracy and identify the best one. The random forest achieved the highest accuracy, 78.82%, but the accuracy differences were negligible. Taking both accuracy (78.78%) and build time (3 seconds) into account, the decision tree is the best predictive model, since the random forest took 154 seconds to build. The results also indicate that age is the most significant influencing factor, and that rats aged between 3.1 and 6 years showed potential in detection performance. The other predicted factors are Session_Completion_Time, Session_Start_Time, and Av_Weight_Per_Year. These results can serve as a reference for rat TB trainers and for researchers in rat TB detection and Information Systems. Further research using other data mining techniques and tools would be valuable.

Place, publisher, year, edition, pages
2019. p. 87
National Category
Social Sciences
Identifiers
URN: urn:nbn:se:uu:diva-385471
OAI: oai:DiVA.org:uu-385471
DiVA, id: diva2:1324472
Subject / course
Information Systems
Educational program
Master programme in Information Systems
Supervisors
Examiners
Available from: 2019-06-14. Created: 2019-06-13. Last updated: 2019-06-14. Bibliographically approved.

Open Access in DiVA

Master's Thesis (2835 kB), 42 downloads
File information
File name: FULLTEXT01.pdf
File size: 2835 kB
Checksum (SHA-512): 8d2130f77ab30008b9adcbd08b064c66ab7928992d3ff3b9fde4f188f071e710feab12a61fbe81ef0cfc58540a1fc311ebbb3fc9fffbf40b95ac1a40ab859963
Type: fulltext
Mimetype: application/pdf

Author
Jonathan, Joan