Extraction of word senses from bilingual resources using graph-based semantic mirroring
Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, The Institute of Technology.
2013 (English)
Independent thesis Basic level (degree of Bachelor), 10,5 credits / 16 HE credits
Student thesis
Alternative title: Extraktion av ordbetydelser från tvåspråkiga resurser med grafbaserad semantisk spegling (Swedish)
Abstract [en]

In this thesis we retrieve semantic information that exists implicitly in bilingual data. We gather input data by repeatedly applying the semantic mirroring procedure. The data is then represented by vectors in a large vector space. A resource of synonym clusters is then constructed by performing K-means centroid-based clustering on the vectors. We evaluate the result manually, using dictionaries, and against WordNet, and discuss prospects and applications of this method.
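
The pipeline described in the abstract (mirroring, vector representation, K-means) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon `EN_SV`, the binary mirror-set vectors, the hand-picked seed centroids, and the function names are hypothetical and are not taken from the thesis, whose actual data and vector construction are not given in this record.

```python
from collections import defaultdict

# Toy English -> Swedish lexicon, invented for illustration only.
EN_SV = {
    "bright": {"ljus", "klok"},
    "light": {"ljus"},
    "clever": {"klok", "smart"},
    "smart": {"smart"},
    "fast": {"snabb"},
    "quick": {"snabb", "rask"},
    "brisk": {"rask"},
}

# Reverse direction, derived from the same toy lexicon.
SV_EN = defaultdict(set)
for en, translations in EN_SV.items():
    for sv in translations:
        SV_EN[sv].add(en)

VOCAB = sorted(EN_SV)


def mirror(word):
    """English words reached by translating `word` to Swedish and back."""
    return {back for sv in EN_SV[word] for back in SV_EN[sv]}


def vectorize(word):
    """Binary vector over VOCAB marking which words are in the mirror set."""
    reached = mirror(word)
    return [1.0 if w in reached else 0.0 for w in VOCAB]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=50):
    """Lloyd's algorithm; returns one cluster index per point."""
    centroids = [list(c) for c in centroids]  # do not mutate the caller's seeds
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_words(seed_words):
    """Cluster all VOCAB words, seeding one centroid per seed word."""
    points = [vectorize(w) for w in VOCAB]
    labels = kmeans(points, [vectorize(s) for s in seed_words])
    clusters = defaultdict(list)
    for word, lab in zip(VOCAB, labels):
        clusters[lab].append(word)
    return [clusters[i] for i in range(len(seed_words))]


if __name__ == "__main__":
    for group in cluster_words(["bright", "fast"]):
        print(group)
```

On this toy lexicon the two seeds separate the "bright/clever/light/smart" words from the "brisk/fast/quick" words. Note that "bright" ends up in a single cluster even though it has two senses here; a hard clustering like this groups words, so distinguishing senses of one word would need a different representation than this sketch uses.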

Abstract [sv] (translated from Swedish)

In this work we extract semantic information that exists implicitly in bilingual data. We collect input data by repeating the semantic mirroring procedure. The data is represented as vectors in a large vector space. We then build a resource of synonym clusters by applying the K-means algorithm to the vectors. We review the result by hand with the help of dictionaries, and against WordNet, and discuss possibilities and applications of the method.

Place, publisher, year, edition, pages
2013. 42 p.
Keyword [en]
computational linguistics, natural language processing, data mining, word sense discrimination, semantic mirroring, vector space modeling, cluster analysis
Keyword [sv] (translated from Swedish)
computational linguistics, language technology, data mining, semantic mirroring, word senses, vector space models
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-91880
ISRN: LIU-IDA/LITH-EX-G--13/008--SE
OAI: oai:DiVA.org:liu-91880
DiVA: diva2:619502
Subject / course
Computer and information science at the Institute of Technology
Presentation
2013-04-05, John von Neumann, B-huset, Campus Valla, Linköping, 13:24 (Swedish)
Uppsok
Technology
Supervisors
Examiners
Available from: 2013-05-06. Created: 2013-05-03. Last updated: 2013-05-06. Bibliographically approved.

Open Access in DiVA

fulltext (711 kB), 360 downloads
File information
File name: FULLTEXT01.pdf
File size: 711 kB
Checksum (SHA-512): dce5fcd2566526b96137d98b0fb7de5d6dbab861396ea6960427f3264c24a056383ce503bc39e0649bb096b65e468b0e57077b6cddd1f9bde7062d09f23066a9
Type: fulltext
Mimetype: application/pdf

Total: 360 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 391 hits