Drug Name Recognition in Reports on Concomitant Medication
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control.
2019 (English)
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]

This thesis evaluates whether and how drug name recognition can be used to find drug names in verbatims from reports on concomitant medication in clinical trials. In clinical trials, a report on concomitant medication is written when a trial participant takes drugs other than the studied drug. This information needs to be coded to a drug reference dictionary. Coded verbatims were used to create the data needed to train the drug name recognition models in this thesis. Labels marking where in each verbatim the coded drug's name appeared were created using Levenshtein distance. The drug name recognition models were trained and tested on these labelled verbatims.
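
As a rough illustration of the labelling step, the sketch below marks the token in a verbatim that is closest to the coded drug name by Levenshtein distance. The abstract does not describe the exact procedure, so the whitespace tokenisation, the single-token match, and the `max_ratio` threshold are assumptions made for this example only, and `label_verbatim` is a hypothetical helper name.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def label_verbatim(verbatim: str, drug_name: str, max_ratio: float = 0.34) -> list[int]:
    """Return one 0/1 label per whitespace token.

    The token closest to drug_name gets label 1, but only if its distance
    is at most max_ratio * len(drug_name). The threshold is an assumption
    for illustration, not a value taken from the thesis.
    """
    tokens = verbatim.lower().split()
    target = drug_name.lower()
    labels = [0] * len(tokens)
    if not tokens:
        return labels
    distances = [levenshtein(token, target) for token in tokens]
    best = min(range(len(tokens)), key=distances.__getitem__)
    if distances[best] <= max_ratio * max(len(target), 1):
        labels[best] = 1
    return labels


# A misspelled drug name is still matched to the coded name.
print(label_verbatim("paracetmol 500 mg daily", "paracetamol"))
```

A fuzzy match of this kind tolerates spelling variation in the verbatims, but it can also attach a label to the wrong token or miss a multi-word drug name, which is one way unreliable labels could arise.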

Drug name recognition was performed using a logistic regression model and a bidirectional long short-term memory model. The bidirectional long short-term memory model achieved the best result, with an F1 score of 82.5% on classifying which words in the verbatims were drug names. A case-by-case study of the results showed that, for single-word verbatims, the model's classifications were sometimes more accurate than the labels it was trained on. The model was also tested on manually labelled gold standard data, where it achieved an F1 score of 46.4%. The results indicate that a bidirectional long short-term memory model can be implemented for drug name recognition, but that label reliability is an issue in this thesis.
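
For reference, the F1 score reported above is the harmonic mean of precision and recall. The sketch below computes it over per-word binary labels, assuming 1 marks a drug-name word and 0 any other word; the function name and the toy labels are illustrative and are not taken from the thesis.

```python
def f1_score(gold: list[int], pred: list[int]) -> float:
    """Token-level F1 for binary labels, where 1 means 'drug name'."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no correct positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: one true positive, one false positive, one false negative.
gold = [1, 0, 0, 1, 0]
pred = [1, 1, 0, 0, 0]
print(f1_score(gold, pred))
```

Because the score is computed against the reference labels, evaluating the same predictions against automatically generated labels and against manually labelled data can give very different values.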

Place, publisher, year, edition, pages
2019. p. 33
Series
UPTEC STS, ISSN 1650-8319 ; 19034
Keywords [en]
Machine Learning, Natural Language Processing, Drug Name Recognition, Classification
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-388687
OAI: oai:DiVA.org:uu-388687
DiVA, id: diva2:1334732
External cooperation
Uppsala Monitoring Centre
Educational program
Systems in Technology and Society Programme
Supervisors
Examiners
Available from: 2019-07-03 Created: 2019-07-03 Last updated: 2019-07-03 Bibliographically approved

Open Access in DiVA

fulltext (789 kB), 21 downloads
File information
File name: FULLTEXT01.pdf
File size: 789 kB
Checksum (SHA-512): 7ecef8cc867b0760df59128d8af4ff3604588fa280f5824c348765718b8c20148f42cae99c3558e62c42bfc00add0e3ee94e62a80cf5b71206ea6679c586c9b3
Type: fulltext
Mimetype: application/pdf
