Representing word semantics for IR by continuous functions
2007 (English). In: Studies in Theory of Information Retrieval. Proceedings of the ICTIR07 Conference, Budapest, 18-20 October 2007 / [ed] Sándor Dominich, Ferenc Kiss, Foundation for Information Society, Budapest, 2007, 149-155 p. Conference paper (Refereed)
Information representation is an important but neglected aspect of building text information retrieval models.
To be efficient, the mathematical objects of a formal model, such as vectors, have to reproduce with reasonable
fidelity language-related phenomena such as the word meaning inherent in index terms. The classical vector space
model (VSM), however, represents word meaning only approximately, while it localizes term, query and document
content exactly. It can be shown that when vectors are replaced with continuous functions, information retrieval
in Hilbert space yields comparable or better results. This is because, according to the non-classical or
continuous vector space model, content cannot be exactly localized. At the same time, the model relies on a
richer representation of word meaning than the VSM can offer.
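The contrast drawn in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: each term is given a hypothetical position on a one-dimensional semantic axis, is represented by a Gaussian of width `sigma` centered there, and similarity is the normalized L2 inner product of the resulting functions. Under these assumptions, two texts that share no index term score zero in the classical VSM but can still overlap as functions.

```python
import math


def gaussian_overlap(a, b, sigma):
    """Closed-form L2 inner product of exp(-(x-a)^2 / (2 sigma^2))
    and exp(-(x-b)^2 / (2 sigma^2)) over the real line."""
    return sigma * math.sqrt(math.pi) * math.exp(-((a - b) ** 2) / (4 * sigma ** 2))


def inner(f, g, positions, sigma):
    """Inner product of two texts, each a dict mapping term -> weight,
    where every term stands for a Gaussian centered at positions[term]."""
    return sum(
        wf * wg * gaussian_overlap(positions[tf], positions[tg], sigma)
        for tf, wf in f.items()
        for tg, wg in g.items()
    )


def function_similarity(f, g, positions, sigma):
    """Cosine-like similarity between texts represented as continuous functions."""
    return inner(f, g, positions, sigma) / math.sqrt(
        inner(f, f, positions, sigma) * inner(g, g, positions, sigma)
    )


def vsm_cosine(f, g):
    """Classical VSM cosine: only terms shared by both texts contribute."""
    dot = sum(w * g.get(t, 0.0) for t, w in f.items())
    norm = math.sqrt(sum(w * w for w in f.values()) * sum(w * w for w in g.values()))
    return dot / norm


# Hypothetical term positions (assumed, not from the paper): nearby
# positions stand for related meanings.
positions = {"car": 0.0, "automobile": 0.3, "banana": 5.0}
sigma = 1.0  # assumed spread of each term's meaning

query = {"car": 1.0}
doc_a = {"automobile": 1.0}
doc_b = {"banana": 1.0}

print(vsm_cosine(query, doc_a))                             # no shared term
print(function_similarity(query, doc_a, positions, sigma))  # strong overlap
print(function_similarity(query, doc_b, positions, sigma))  # negligible overlap
```

In this toy setting the exact localization of the VSM makes "car" and "automobile" orthogonal, while the spread-out functions let related terms contribute to the match. How term positions and function shapes are actually obtained is left open here.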
Place, publisher, year, edition, pages
Foundation for Information Society, Budapest, 2007. 149-155 p.
Keywords
information retrieval, language representation, word semantics, signal processing, information representation
Information Studies, Specific Languages, Atom and Molecular Physics and Optics, Mathematics
Identifiers
URN: urn:nbn:se:hb:diva-5855
ISI: 000300454800001
Local ID: 2320/3229
ISBN: 978-963-06-3237-9
OAI: oai:DiVA.org:hb-5855
DiVA: diva2:886537