Filtering with Approximate Predicates
Narayanan Shivakumar, Hector Garcia-Molina, Chandra Chekuri

Abstract
Approximate predicates can be used to reduce the number of comparisons made by expensive, complex predicates. For example, to check if a point is within a region (expensive predicate) we can first check if the point is within a bounding rectangle (approximate predicate). In general, approximate predicates may have false positive and false negative errors. In this paper we study the problem of selecting and structuring approximate predicates in order to reduce the cost of processing a user query, while keeping errors within user-specified bounds. We model different types of approximate predicates and their dependencies, we derive expressions for the errors of compound predicates, and we develop query optimization strategies. We also study the complexity of our optimization strategies under various scenarios, and we present an experimental case study that illustrates the potential gains achieved by optimizing queries with approximate predicates.
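The bounding-rectangle example in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the function names (bounding_box, in_box, in_polygon, filter_points) are hypothetical, the region is assumed to be a simple polygon, and the expensive predicate is assumed to be a ray-casting point-in-polygon test. In this particular setup the approximate predicate has false positives only (points inside the rectangle but outside the region), so every point that passes it is still checked by the expensive predicate. The paper's general setting also allows false negatives, which this sketch does not model.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(polygon: List[Point]) -> Box:
    """Axis-aligned rectangle enclosing the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_box(p: Point, box: Box) -> bool:
    """Approximate predicate: cheap, may report false positives."""
    x, y = p
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Expensive predicate: ray-casting point-in-polygon test."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through p?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_points(points: List[Point], polygon: List[Point]):
    """Apply the cheap filter first; run the expensive test only on survivors.

    Returns the matching points and the number of expensive calls made.
    """
    box = bounding_box(polygon)
    expensive_calls = 0
    result = []
    for p in points:
        if not in_box(p, box):
            continue  # rejected by the approximate predicate
        expensive_calls += 1
        if in_polygon(p, polygon):
            result.append(p)
    return result, expensive_calls


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
candidates = [(1.0, 1.0), (3.0, 3.0), (5.0, 5.0), (-1.0, 2.0)]
matches, calls = filter_points(candidates, triangle)
```

Here the point (3.0, 3.0) is a false positive of the rectangle test: it passes the cheap check but is rejected by the expensive one. The two points outside the rectangle never reach the expensive predicate, which is where the savings come from.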

References


[Aro96]
...
[CGK89]
Danette Chimenti, Ruben Gamboa, Ravi Krishnamurthy: Towards an Open Architecture for LDL. VLDB 1989: 195-203
[CGMP96]
Chen-Chuan K. Chang, Hector Garcia-Molina, Andreas Paepcke: Boolean Query Mapping Across Heterogeneous Information Sources. TKDE 8(4): 515-521(1996)
[CK94]
Surajit Chaudhuri, Phokion G. Kolaitis: Can Datalog be Approximated? PODS 1994: 86-96
[CS93]
Surajit Chaudhuri, Kyuseok Shim: Query Optimization in the Presence of Foreign Functions. VLDB 1993: 529-542
[CS96]
Surajit Chaudhuri, Kyuseok Shim: Optimization of Queries with User-defined Predicates. VLDB 1996: 87-98
[CS97]
...
[ea95]
Myron Flickner, Harpreet S. Sawhney, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, Peter Yanker: Query by Image and Video Content: The QBIC System. IEEE Computer 28(9): 23-32(1995)
[GJ79]
M. R. Garey, David S. Johnson: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman 1979, ISBN 0-7167-1044-7
[HNSS96]
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami: Selectivity and Cost Estimation for Joins Based on Random Sampling. JCSS 52(3): 550-569(1996)
[HS93]
Joseph M. Hellerstein, Michael Stonebraker: Predicate Migration: Optimizing Queries with Expensive Predicates. SIGMOD Conference 1993: 267-276
[IM97]
...
[LYGM98]
...
[ME97]
Alvaro E. Monge, Charles Elkan: An Efficient Domain-Independent Algorithm for Detecting Approximately Duplicate Database Records. DMKD 1997: 0-
[PD96]
Jignesh M. Patel, David J. DeWitt: Partition Based Spatial-Merge Join. SIGMOD Conference 1996: 259-270
[Ros96]
...
[SAC+79]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34
[SB88]
Gerard Salton, Chris Buckley: Term-Weighting Approaches in Automatic Text Retrieval. Information Processing and Management 24(5): 513-523(1988)
[Sea96]
Praveen Seshadri, Joseph M. Hellerstein, Hamid Pirahesh, T. Y. Cliff Leung, Raghu Ramakrishnan, Divesh Srivastava, Peter J. Stuckey, S. Sudarshan: Cost-Based Optimization for Magic: Algebra and Implementation. SIGMOD Conference 1996: 435-446
[SGM95]
Narayanan Shivakumar, Hector Garcia-Molina: SCAM: A Copy Detection Mechanism for Digital Documents. DL 1995: 0-
[SGM96]
Narayanan Shivakumar, Hector Garcia-Molina: Building a Scalable and Accurate Copy Detection Mechanism. Digital Libraries 1996: 160-168
[SGMC98]
...
[Ull88]
Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, Volume I. Computer Science Press 1988, ISBN 0-7167-8158-1
[Vas98]
...
[VP97]
Vasilis Vassalos, Yannis Papakonstantinou: Describing and Using Query Capabilities of Heterogeneous Sources. VLDB 1997: 256-265
[YI95]
Yannis E. Ioannidis, Viswanath Poosala: Histogram-Based Solutions to Diverse Database Estimation Problems. Data Engineering Bulletin 18(3): 10-18(1995)
BIBTEX

@inproceedings{DBLP:conf/vldb/ShivakumarGC98,
author = {Narayanan Shivakumar and
Hector Garcia-Molina and
Chandra Chekuri},
editor = {Ashish Gupta and
Oded Shmueli and
Jennifer Widom},
title = {Filtering with Approximate Predicates},
booktitle = {VLDB'98, Proceedings of 24th International Conference on Very
Large Data Bases, August 24-27, 1998, New York City, New York,
USA},
publisher = {Morgan Kaufmann},
year = {1998},
isbn = {1-55860-566-5},
pages = {263-274},
crossref = {DBLP:conf/vldb/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}


DBLP: Copyright ©1999 by Michael Ley (ley@uni-trier.de).