CTIT University of Twente

DIRKA: Distributed Information Retrieval by means of Keyword Auctions (NWO-Vidi)

NWO-Vidi / No. 639.022.809
Project Manager: Dr. ir. Djoerd Hiemstra
Faculty of Electrical Engineering, Mathematics and Computer Science
Tel.: +31-53-4892335
Email: hiemstra@cs.utwente.nl


The project’s aim is to distribute internet search functionality in such a way that communities of users and/or federations of small search systems provide search services collaboratively. Instead of gathering all data at a central point and processing queries there, as today’s search systems do, the project will distribute queries over many small autonomous search systems and process them locally. Distributed information retrieval is a well-researched subarea of information retrieval, but it has not resulted in practical solutions for large-scale search problems, for two reasons: the administration costs of setting up large numbers of installations are high, and in practice it turns out to be hard to direct queries to the appropriate local search systems. In this project we will research a radically new approach to distributed search: distributed information retrieval by means of keyword auctions. Keyword auctions like Google’s AdWords give advertisers the opportunity to provide targeted advertisements by bidding on specific keywords. Analogous to these keyword auctions, local search systems will bid for keywords at a central broker. They “pay” by serving queries for the broker. The broker will send queries to those local search systems that optimize the overall effectiveness of the system, i.e., local search systems that are willing to serve many queries and are also able to provide high-quality results. The project will approach the problem from three different angles: 1) modeling the local search system, including models for automatic bidding and multi-word keywords; 2) modeling the search broker’s optimization using the bids, the quality of the answers, and click-through rates; 3) integrating the structured data typically available behind the web forms of local search systems with text search. The approaches will be evaluated using prototype systems and simulations on benchmark test collections.
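The broker idea described above can be illustrated with a minimal sketch. It is not the project's model: the class `Bid`, the functions `score` and `route`, the engine names, and the scoring rule (offered capacity multiplied by an estimated quality) are all assumptions made for illustration only. The sketch shows local search systems bidding on keywords with the number of queries they are willing to serve, and a broker routing a query to the highest-scoring bidders.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    engine: str     # hypothetical name of a local search system
    keyword: str    # keyword the system bids on
    capacity: int   # queries the system offers to serve (its "payment")
    quality: float  # broker's estimate of result quality, in [0, 1]


def score(bid: Bid) -> float:
    # Assumed scoring rule: reward both willingness to serve and quality.
    return bid.capacity * bid.quality


def route(query: str, bids: list[Bid], k: int = 2) -> list[str]:
    """Return up to k engines whose bids match a term in the query."""
    terms = set(query.lower().split())
    best: dict[str, Bid] = {}
    for bid in bids:
        if bid.keyword not in terms:
            continue
        # Keep only the strongest matching bid per engine.
        current = best.get(bid.engine)
        if current is None or score(bid) > score(current):
            best[bid.engine] = bid
    # Highest score first; ties are broken by engine name.
    ranked = sorted(best.values(), key=lambda b: (-score(b), b.engine))
    return [b.engine for b in ranked[:k]]


bids = [
    Bid("music-db", "jazz", capacity=100, quality=0.9),
    Bid("news", "jazz", capacity=500, quality=0.1),
    Bid("library", "jazz", capacity=50, quality=0.8),
    Bid("library", "books", capacity=200, quality=0.5),
]

print(route("jazz books", bids))
```

In this toy setup a system that offers many queries but returns poor results (`news`) loses to smaller systems with better estimated quality. A real broker would also have to estimate quality, for example from click-through rates, and handle multi-word keywords, both of which the sketch leaves out.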


Project duration: 1-10-2008/1-10-2013
Project budget: €575k
Number of person/years: 2.4 fte/year
Involved groups: Databases (DB)