Using Machine Learning to Categorize Documents in a Construction Project
Linnaeus University, Faculty of Technology, Department of computer science and media technology (CM).
2019 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Automating document handling in the construction industry could save large amounts of time, effort, and money, and classifying documents is an important step in that automation. In the field of machine learning, much research has gone into perfecting algorithms and techniques, but many areas where those techniques could be applied have not yet been studied. In this study I examined how effectively the multinomial Naïve-Bayes machine learning algorithm could classify 1427 documents from a construction project, split into 19 categories. The experiment achieved an accuracy of 92.7%, and the paper discusses some ways that accuracy could be improved. However, data extraction proved to be a bottleneck, and only 66% of the original documents could be used for testing the classifier.
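
To make the technique named in the abstract concrete, here is a minimal sketch of a multinomial Naïve-Bayes text classifier with add-one (Laplace) smoothing. It is not the thesis's implementation: the function names (`train`, `predict`, `accuracy`), the whitespace tokenisation, the smoothing choice, and the two toy categories (`drawing`, `invoice`) are all assumptions made for illustration, and the toy data bears no relation to the 1427 documents or 19 categories of the study.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Fit a multinomial Naive Bayes model from (tokens, label) pairs."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # token frequencies per class
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "n_docs": len(docs),
    }


def predict(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        # Log prior: share of training documents in this class.
        score = math.log(n_class_docs / model["n_docs"])
        for tok in tokens:
            if tok in model["vocab"]:  # tokens unseen in training are skipped
                # Add-one smoothing avoids zero probabilities.
                score += math.log((counts[tok] + 1) / (total + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs):
    """Fraction of (tokens, label) pairs classified correctly."""
    hits = sum(predict(model, tokens) == label for tokens, label in docs)
    return hits / len(docs)


# Hypothetical toy data, for illustration only.
train_docs = [
    ("floor plan section elevation".split(), "drawing"),
    ("elevation detail plan scale".split(), "drawing"),
    ("invoice amount due payment".split(), "invoice"),
    ("payment invoice total vat".split(), "invoice"),
]
model = train(train_docs)
print(predict(model, "plan scale detail".split()))
```

In a real evaluation the accuracy would be measured on documents held out from training, and the text would first have to be extracted from the source files, which is the step the abstract identifies as the bottleneck.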

Place, publisher, year, edition, pages
2019. p. 24
Keywords [en]
Machine learning, multinomial Naïve-Bayes, construction industry, document classification
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:lnu:diva-90091
OAI: oai:DiVA.org:lnu-90091
DiVA, id: diva2:1370362
Subject / course
Computer Science
Educational program
Computer Science, Bachelor's programme, 60 credits (Datavetenskap, kandidatprogram, 60 hp)
Available from: 2019-11-15. Created: 2019-11-14. Last updated: 2019-11-15. Bibliographically approved.

Open Access in DiVA
File information
File name: FULLTEXT01.pdf
File size: 1269 kB
Checksum (SHA-512):
2926b6fc246d7c8db8eb0df45cd0b55bd4a9b0f392265587c3dc6ff0a97c1e96f89f7b34eeef8f4257aaa06280c727386d62703102cd2eeef7f273de78ae66b7
Type: fulltext
Mimetype: application/pdf
