Definition Extraction From Swedish Technical Documentation: Bridging the gap between industry and academy approaches
Linköping University, Department of Computer and Information Science.
2016 (English). Independent thesis, Basic level (degree of Bachelor), 12 credits / 18 HE credits. Student thesis
Abstract [en]

Terminology is concerned with the creation and maintenance of concept systems, terms and definitions. Automatic term and definition extraction is used to simplify this otherwise manual and sometimes tedious process. This thesis presents an integrated approach that combines pattern matching and machine learning, utilising feature vectors in which each feature is a Boolean function of a regular expression. The integrated approach is compared with the two more classic approaches and shows a significant increase in recall while maintaining comparable precision. Less promising is the negative correlation between the performance of the integrated approach and training size. Further research is suggested.
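
To illustrate what "each feature is a Boolean function of a regular expression" could look like in practice, here is a minimal sketch, not the implementation from the thesis. The patterns, the toy Swedish sentences, and the function names (`featurize`, `train`, `predict`) are all hypothetical. The classifier is a small hand-written Bernoulli naive Bayes, chosen only because naive Bayes appears among the record's keywords.

```python
import math
import re

# Hypothetical patterns for illustration only; not the patterns used in the thesis.
PATTERNS = [
    re.compile(r"\bär (en|ett)\b", re.IGNORECASE),     # "is a"
    re.compile(r"\bdefinieras som\b", re.IGNORECASE),  # "is defined as"
    re.compile(r"\bkallas\b", re.IGNORECASE),          # "is called"
    re.compile(r"\?\s*$"),                             # sentence ends with "?"
]

# Invented toy data: (sentence, is_definition).
TRAIN = [
    ("En ventil är en anordning som reglerar flödet.", True),
    ("Tryck definieras som kraft per ytenhet.", True),
    ("Denna komponent kallas fläns.", True),
    ("Stäng av strömmen före service.", False),
    ("Kontrollera oljenivån varje vecka.", False),
    ("Är pumpen igång?", False),
]


def featurize(sentence):
    """Map a sentence to a Boolean vector: one feature per regex."""
    return [bool(p.search(sentence)) for p in PATTERNS]


def train(examples):
    """Fit a Bernoulli naive Bayes model with add-one smoothing."""
    model = {}
    for cls in {label for _, label in examples}:
        rows = [featurize(s) for s, label in examples if label == cls]
        prior = len(rows) / len(examples)
        # P(feature is True | class), smoothed so it is never 0 or 1.
        probs = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        model[cls] = (prior, probs)
    return model


def predict(model, sentence):
    """Return the class with the highest log posterior."""
    x = featurize(sentence)

    def log_posterior(cls):
        prior, probs = model[cls]
        return math.log(prior) + sum(
            math.log(p if on else 1 - p) for on, p in zip(x, probs)
        )

    return max(model, key=log_posterior)


model = train(TRAIN)
```

Calling `predict(model, sentence)` on an unseen sentence returns `True` when the matched patterns make the "definition" class more probable under this toy model. With so few examples the result says nothing about real-world performance.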

Place, publisher, year, edition, pages
2016. 30 p.
Keyword [en]
definition extraction, machine learning, pattern matching, naive bayes, regular expressions, rev, classifier, terminology, comparison
Keyword [sv]
definitionsextraktion, maskininlärning, mönstermatchning, reguljära uttryck, rev, klassificerare, terminologi, jämförelse
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-131057
ISRN: LIU-IDA/KOGVET-G--16/024--SE
OAI: oai:DiVA.org:liu-131057
DiVA: diva2:968018
External cooperation
Fodina Language Technology
Subject / course
Cognitive science
Available from: 2016-09-14
Created: 2016-09-06
Last updated: 2016-09-15
Bibliographically approved

Open Access in DiVA

Definition Extraction From Swedish Technical Documentation: Bridging the gap between industry and academy approaches (787 kB), 26 downloads
File information
File name: FULLTEXT02.pdf
File size: 787 kB
Checksum SHA-512
59415a847151fef8d8478856be24ca0f01e53ca7acae303d72ed9aae76d133bbd562de720ab13dd0d5db026f6825d626a4add2e2a9d198d6173ac95408d0d3db
Type: fulltext
Mimetype: application/pdf

Author
Helmersson, Benjamin

Total: 28 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1046 hits