Using Morphological Analysis in an Information Retrieval System for Résumés
KTH, School of Computer Science and Communication (CSC).
2016 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Användning av morfologisk analys i ett informationssökningssystem för CVn (Swedish)
Abstract [en]

This thesis investigates the use of an information retrieval system for résumés written in Swedish and how morphological methods, such as lemmatization, affect the results. To investigate this, a small information retrieval system was built using lemmatization and compound splitting. The thesis also discusses how the relevance of a résumé can be determined and evaluates the information retrieval system in terms of precision, recall and ranking ability. The results show that morphological analysis had a positive effect in some cases, especially when the query contained more Swedish words than names of skills. When the query consisted mostly of technical skills, it had a negative impact. Lemmatization had a small positive effect on ranking ability, whereas compound splitting had a negative impact regardless of the queries' features.

Abstract [sv]

This thesis examines how the use of morphological analysis, such as lemmatization, affects the performance of an information retrieval system for résumés in Swedish. It also addresses how the relevance of a résumé can be assessed, and the information retrieval system is evaluated in terms of precision and recall as well as "discounted cumulative gain", a measure of ranking ability. The results show that morphological analysis has positive effects when the query to the search system contains many Swedish words. When the query contained many names of technologies, using morphology proved to be negative, particularly with regard to compound splitting. Lemmatization was the method that had a positive effect in some cases, whereas compound splitting had only a negative effect.

Place, publisher, year, edition, pages
2016.
Keyword [en]
résumé, information retrieval, morphology, Swedish
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-189578
OAI: oai:DiVA.org:kth-189578
DiVA: diva2:947155
External cooperation
Netlight Consulting AB
Subject / course
Computer Science
Educational program
Master of Science in Engineering - Media Technology
Supervisors
Examiners
Available from: 2016-07-07 Created: 2016-07-07 Last updated: 2016-07-07
Bibliographically approved

Open Access in DiVA

fulltext (809 kB), 115 downloads
File information
File name FULLTEXT01.pdf
File size 809 kB
Checksum SHA-512
ee82549cb73efd32d19f5cc51ccd9ca748e53224085841e33b1251e343156753c924972ea1e580c222916120d225e8a24743ae3650c9c079b8b141958b5a940b
Type fulltext
Mimetype application/pdf

By organisation
School of Computer Science and Communication (CSC)
Computer Science

Total: 115 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
