Compound terms and their constituent elements in information retrieval
RISE, Swedish ICT, SICS. ORCID iD: 0000-0003-4042-4919
Number of authors: 1. 2005 (English). Conference paper, poster (with or without abstract) (peer-reviewed)
Abstract [en]

Compounds pose challenges to simple text retrieval algorithms, especially in languages where they are formed by concatenation without intervening whitespace between elements. Search queries that include compounds may not retrieve texts where elements of those compounds occur in uncompounded form; search queries that lack compounds will not retrieve texts where the salient elements are buried inside compounds. This study explores the distributional characteristics of compounds and their constituent elements using Swedish, a compounding language, as a test case. The compounds studied are taken from experimental search topics given for CLEF, the Cross-Language Evaluation Forum. Their distributions are related to relevance assessments made on the collection under study and evaluated in terms of divergence from the expected random distribution over documents. The observations have direct ramifications for, e.g., query analysis and term weighting approaches in information retrieval system design.
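
As an illustration only, the idea of comparing a term's observed spread over documents with a random baseline can be sketched as follows. The abstract does not specify which divergence measure the study uses; this sketch assumes a simple Poisson baseline, and the function names, the toy collection, and the crude substring matching for elements inside compounds are all hypothetical choices made for the example.

```python
import math

def term_stats(docs, term, inside_compounds=False):
    """Return (collection frequency, document frequency) for a term.

    With inside_compounds=True, a token counts as a hit when the term
    occurs anywhere inside it (a crude stand-in for compound splitting).
    """
    cf = 0
    df = 0
    for tokens in docs:
        if inside_compounds:
            hits = sum(1 for tok in tokens if term in tok)
        else:
            hits = sum(1 for tok in tokens if tok == term)
        cf += hits
        if hits > 0:
            df += 1
    return cf, df

def expected_document_frequency(cf, num_docs):
    """Documents expected to contain the term if its cf occurrences were
    scattered at random over num_docs documents (Poisson approximation)."""
    return num_docs * (1.0 - math.exp(-cf / num_docs))

def clumping_ratio(observed_df, cf, num_docs):
    """Expected df divided by observed df; values above 1 mean the term
    is concentrated in fewer documents than random scatter would give."""
    if observed_df == 0:
        raise ValueError("term does not occur in the collection")
    return expected_document_frequency(cf, num_docs) / observed_df

# Toy Swedish collection (assumed data, not from the study).
docs = [
    ["fotbollsmatch", "i", "dag"],
    ["fotboll", "och", "fotboll"],
    ["väder", "i", "dag"],
    ["fotbollsplan", "fotboll"],
]

for inside in (False, True):
    cf, df = term_stats(docs, "fotboll", inside_compounds=inside)
    print(inside, cf, df, round(clumping_ratio(df, cf, len(docs)), 3))
```

Counting the element both as a free-standing word and inside compounds shows how the two views of the same term can give different frequency profiles, which is the kind of difference that would matter for term weighting.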

Place, publisher, year, edition, pages
2005, 1.
Identifiers
URN: urn:nbn:se:ri:diva-20956; OAI: oai:DiVA.org:ri-20956; DiVA id: diva2:1040990
Conference
15th Nordic Conference of Computational Linguistics
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-03-08. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Author/editor
Karlgren, Jussi