Sequential Pattern Mining on Electronic Medical Records for Finding Optimal Clinical Pathways
KTH, School of Electrical Engineering and Computer Science (EECS), Software and Computer systems, SCS.
2018 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Electronic Medical Records (EMRs) are digital versions of paper charts, used to record the treatment of patients in hospitals. Clinical pathways are guidelines for how to treat different diseases, determined by observing the outcomes of previous treatments. Sequential pattern mining is a form of data mining in which the data are organized as sequences. It is an active research topic in data mining, and new variations on existing algorithms are introduced frequently. In a previous report, the sequential pattern mining algorithm PrefixSpan was used to mine patterns in EMRs in order to verify existing clinical pathways or suggest new ones. It was found to verify pathways only partially. One of the reasons given was that PrefixSpan was too inefficient to mine at a minimum support low enough to include some items. In this report, CSpan is used instead, since it is reported to outperform PrefixSpan by up to two orders of magnitude, with the aim of improving runtime and thereby addressing the problems identified in the previous work. The results show that CSpan did improve the runtime and that the algorithm was able to mine at a lower minimum support. However, the output was only barely improved.
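To illustrate what "minimum support" means in this setting, the following is a minimal sketch of a PrefixSpan-style miner over sequences of single items. It is not the implementation used in the thesis, and it does not show CSpan. The function name `prefixspan`, the `records` data, and the event names in it are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every sequential pattern that occurs
    (as a subsequence) in at least `min_support` of the input sequences.
    Simplified sketch: each sequence element is a single item."""
    results = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, start position) pairs: the
        # suffix of each sequence that follows the current prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue  # infrequent items cannot extend the prefix
            pattern = prefix + (item,)
            results[pattern] = support

            # Project onto the suffix after the first occurrence of `item`.
            new_projected = []
            for seq_idx, start in projected:
                try:
                    pos = sequences[seq_idx].index(item, start)
                except ValueError:
                    continue
                new_projected.append((seq_idx, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical treatment sequences, one list of events per patient.
records = [
    ["admit", "xray", "surgery", "discharge"],
    ["admit", "surgery", "discharge"],
    ["admit", "xray", "discharge"],
    ["admit", "bloodtest", "xray", "surgery", "discharge"],
]

frequent = prefixspan(records, min_support=3)
rare_included = prefixspan(records, min_support=1)
```

With `min_support=3`, the event `bloodtest` (seen in one record only) never appears in any pattern. Lowering the threshold to 1 includes it, at the cost of many more patterns to enumerate, which is the trade-off between runtime and coverage that the abstract describes.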

Abstract [sv] (translated to English)

Electronic Medical Records (EMRs) are digital versions of the treatment history of patients in hospitals. Clinical pathways are used as guidelines for how different diseases should be treated, and are determined by observing the outcomes of previous treatments. Sequential pattern mining is a type of data mining in which the data being processed are structured as sequences. It is a common research area within data mining, where many new variations of existing algorithms are introduced frequently. In a previous report, the sequential pattern mining algorithm PrefixSpan was applied to EMRs in order to verify existing clinical pathways or suggest new ones. However, it could only verify pathways partially. One of the reasons given was that PrefixSpan was too inefficient to be run with a support low enough to find certain procedures in a treatment. In this report, CSpan is used instead, since it is said to outperform PrefixSpan by up to two orders of magnitude, in order to improve the running time and thereby address the problems mentioned in the previous report. The results show that CSpan improved the running time and that the algorithm could be run with a lower support. However, the output was barely improved.

Place, publisher, year, edition, pages
2018. p. 27
Series
TRITA-EECS-EX ; 2018:133
Keywords [en]
PrefixSpan, CSpan, Sequential Pattern Mining, EMR, Electronic Medical Record, Clinical Pathway
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-230104
OAI: oai:DiVA.org:kth-230104
DiVA, id: diva2:1216267
External cooperation
Tokyo Institute of Technology, Yokota laboratory
Subject / course
Computer Science
Educational program
Master of Science in Engineering - Computer Science and Technology
Available from: 2018-06-18. Created: 2018-06-11. Last updated: 2018-06-18. Bibliographically approved.

Open Access in DiVA

Full text: FULLTEXT01.pdf (912 kB, application/pdf)
Checksum (SHA-512): f2e5d0be1162cfeb1b1391c39f6a9cec712238f5be160368275ab8ec47a62d8d93fbef3b9c1f901af4b43ef9d8b53e64212e38d1b0936c3ba688cb6fc1c7de08
