Utilizing linguistic analysis in multiple source search engines
Modern search engines offer several data sources to users, e.g. news search,
image search and video search. When a user enters a query in a search engine,
it is up to the user to select a source other than the default web search.
On average, a user considers only the first few results of a search, and does so
within a few seconds. It would therefore improve the user experience if the user
did not have to restrict the sources manually to refine a search.
This project will evaluate different machine learning methods for classifying
which sources are relevant to a query. The goal is an automated learning system
that takes labeled input and uses it to inform or direct the user to the
relevant sources.
The project will take advantage of a Yahoo! product, the Yahoo! Query Linguist
Analysis Service (hereafter QLAS). The goal is to incorporate semantic data from
QLAS into the learning system. This should increase the amount of information
available to the learning system and improve its performance. It is not yet clear
how this semantic data can be combined with the training data and incorporated
into the learning system, so a substantial part of the project will be to
explore this.
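One way to picture combining semantic data with labeled queries is sketched below. It is only an illustration under stated assumptions: the tag dictionary `TOY_TAGS` is a hypothetical stand-in for semantic output (the actual QLAS output format is not described here), the training queries are invented, and multinomial naive Bayes is chosen for brevity rather than because the project is known to use it.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for a query analysis service: maps known tokens to
# semantic tags. This is NOT the real QLAS interface or output format.
TOY_TAGS = {"obama": "person", "paris": "place", "london": "place", "trailer": "media"}


def features(query, use_tags=True):
    """Bag-of-words tokens, optionally augmented with semantic tag features."""
    tokens = query.lower().split()
    feats = list(tokens)
    if use_tags:
        feats += ["TAG=" + TOY_TAGS[t] for t in tokens if t in TOY_TAGS]
    return feats


class NaiveBayesSourceClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, queries, sources, use_tags=True):
        self.use_tags = use_tags
        self.class_counts = Counter(sources)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for query, source in zip(queries, sources):
            for feat in features(query, use_tags):
                self.feat_counts[source][feat] += 1
                self.vocab.add(feat)
        return self

    def scores(self, query):
        """Return the log-probability score of each source for a query."""
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        result = {}
        for source, count in self.class_counts.items():
            denom = sum(self.feat_counts[source].values()) + vocab_size
            logp = math.log(count / total)
            for feat in features(query, self.use_tags):
                if feat in self.vocab:  # ignore features never seen in training
                    logp += math.log((self.feat_counts[source][feat] + 1) / denom)
            result[source] = logp
        return result

    def predict(self, query):
        sc = self.scores(query)
        return max(sc, key=sc.get)


# Invented labeled queries, purely for illustration.
TRAIN = [
    ("obama speech today", "news"),
    ("election results", "news"),
    ("paris photos", "image"),
    ("sunset wallpaper", "image"),
    ("movie trailer", "video"),
    ("funny cat clips", "video"),
]
QUERIES = [q for q, _ in TRAIN]
SOURCES = [s for _, s in TRAIN]

clf = NaiveBayesSourceClassifier().fit(QUERIES, SOURCES, use_tags=True)
plain = NaiveBayesSourceClassifier().fit(QUERIES, SOURCES, use_tags=False)
```

In this toy setup, the query "london skyline" shares no words with the training data, so the word-only model `plain` scores all sources equally, while `clf` can fall back on the shared `TAG=place` feature. Whether real semantic data helps in the same way is exactly the open question the project sets out to explore.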
This project was done in cooperation with Yahoo! Technologies Norway AS (YTN).
YTN develops Vespa, a search engine platform that can search across multiple
sources. YTN is interested in research on learning source relevance to improve
the search experience in Yahoo! services. YTN is also interested in how data from
QLAS could be used by Vespa to enable source relevance classification when Vespa
is used in a multiple-index setup.
Place, publisher, year, edition, pages
Institutt for datateknikk og informasjonsvitenskap, 2011. 114 p.
ntnudaim:5772, MIT informatikk, Kunstig intelligens og læring
Identifiers
URN: urn:nbn:no:ntnu:diva-14468
Local ID: ntnudaim:5772
OAI: oai:DiVA.org:ntnu-14468
DiVA: diva2:454079
Hetland, Magnus Lie, Associate Professor
Banino-Rokkones, Cyril