2003 Digital Symposium Collection

Distributed Top-N Query Processing with Possibly Uncooperative Local Systems

Clement T. Yu, George Philip, and Weiyi Meng

Session A3: Query Processing in the Web

Abstract

We consider the problem of processing top-N queries in a distributed environment with possibly uncooperative local database systems. For a given top-N query, the problem is to efficiently find the N tuples that best satisfy the query, though not necessarily completely. Top-N queries are gaining popularity in relational databases and are expected to be very useful for e-commerce applications. Many companies provide the same type of goods and services to the public on the Web, and relational databases may be employed to manage the data. It is not feasible for a user to query a large number of databases individually. It is therefore desirable to provide a facility where a user query is accepted at some site, suitable tuples are retrieved from appropriate sites, and the results are merged and then presented to the user. In this paper, we present a method for constructing the desired facility. Our method consists of two steps. The first step determines which databases are likely to contain the desired tuples for a given query, so that the databases can be ranked by their desirability with respect to the query. Four different techniques are introduced for this step, one of which requires no cooperation from local systems. The second step determines how the ranked databases should be searched and which tuples from the searched databases should be returned. A new algorithm is proposed for this purpose. Experimental results comparing the different methods are presented, and very promising results are obtained using the method that requires no cooperation from local databases.
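The two-step structure described above (rank the databases, then search them in rank order and merge the results) can be illustrated with a minimal Python sketch. This is not the algorithm proposed in the paper, whose ranking techniques and search procedure are not given in the abstract. It assumes a toy setting: each database is a list of numbers, a query is a target value, "best" means smallest absolute distance to the target, and each database exposes a hypothetical `(min, max)` summary, which is itself a form of cooperation. The names `lower_bound`, `top_n`, `databases`, and `summaries` are illustrative only.

```python
import heapq


def lower_bound(summary, target):
    """Smallest distance any value in [lo, hi] could have to the target."""
    lo, hi = summary
    if target < lo:
        return lo - target
    if target > hi:
        return target - hi
    return 0


def top_n(databases, summaries, target, n):
    """Return (results, searched) for a toy distributed top-N query.

    databases: dict mapping name -> list of numbers (assumed local data)
    summaries: dict mapping name -> (min, max) of that list (assumed metadata)
    """
    # Step 1: rank databases by how promising they look for this query.
    order = sorted(databases, key=lambda name: lower_bound(summaries[name], target))

    best = []      # heap of (-distance, value, name); root is the worst kept tuple
    searched = []
    # Step 2: search in rank order, stopping once no remaining database can help.
    for name in order:
        if len(best) == n and lower_bound(summaries[name], target) >= -best[0][0]:
            break
        searched.append(name)
        # Each local system only needs to return its own n best tuples.
        local = heapq.nsmallest(n, databases[name], key=lambda v: abs(v - target))
        for v in local:
            item = (-abs(v - target), v, name)
            if len(best) < n:
                heapq.heappush(best, item)
            elif item[0] > best[0][0]:
                heapq.heapreplace(best, item)

    # Merge: report (distance, value, source) sorted from best to worst.
    results = sorted((-neg_d, v, name) for neg_d, v, name in best)
    return results, searched
```

In this sketch the early stop is safe only because the assumed summaries give a true lower bound on distance; with estimated or sampled statistics, as an uncooperative setting would require, the stopping rule would be a heuristic rather than a guarantee.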