Stylistic Experiments for Information Retrieval
Stockholm University, SICS. ORCID iD: 0000-0003-4042-4919
2000 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

Information retrieval systems are built to handle texts as topical items: texts are tabulated by occurrence frequencies of content words in them, under the assumption that text topic is reasonably well modeled by content word occurrence. But texts have several interesting characteristics beyond topic. The experiments described in this text investigate stylistic variation. Roughly put, style is the difference between two ways of saying the same thing, and systematic stylistic variation can be used to characterize the genre of documents. These experiments investigate whether stylistic information is distinguishable using simple language engineering methods and, if so, whether this type of information can be used to improve information retrieval systems.

A first set of experiments shows that simple measures of stylistic variation can be used to distinguish genres from each other quite adequately; how well depends on what the genres in question are.

A second set of experiments evaluates the utility of stylistic measures for the purposes of information retrieval, to identify common characteristics of relevant and non-relevant documents. The conclusion is that the requests for information as typically expressed to retrieval systems are too terse and inspecific for non-topical information to improve retrieval results. Systems for information access need to be designed from the beginning to handle richer information about the texts and documents at hand: information about stylistic variation cannot easily be added to an existing system.

A third set of experiments explores how an interactive system can be designed to incorporate stylistic information in the interface between user and system. These experiments resulted in the design of an interface for categorizing retrieval results by genre, and displaying the retrieval results using this categorization. This interface is integrated into a prototype for retrieving information from the World Wide Web.

Place, publisher, year, edition, pages
Stockholm: Department of Linguistics, Stockholm University, 2000. 130 p.
National Category
General Language Studies and Linguistics
URN: urn:nbn:se:kth:diva-187749
ISBN: 91-7265-058-3
OAI: diva2:931529

QC 20160530

Available from: 2016-05-30. Created: 2016-05-28. Last updated: 2016-05-30. Bibliographically approved.

Open Access in DiVA

fulltext (4565 kB), 8 downloads
File information
File name: FULLTEXT01.pdf; File size: 4565 kB; Checksum SHA-512
Type: fulltext; Mimetype: application/pdf

By author/editor
Karlgren, Jussi