MindReader: Querying Databases Through Multiple Examples
Yoshiharu Ishikawa, Ravishankar Subramanya, Christos Faloutsos

Abstract
Users often cannot easily express their queries. For example, in a multimedia/image-by-content setting, the user might want photographs with sunsets; in current systems, like QBIC, the user has to give a sample query and specify the relative importance of color, shape, and texture. Even worse, the user might want correlations between attributes: for example, in a traditional medical record database, a medical researcher might want to find "mildly overweight patients", where the implied query would be "weight/height ~ 4 lb/inch".

Our goal is to provide a user-friendly but theoretically solid method to handle such queries. We allow the user to give several examples and, optionally, their 'goodness' scores, and we propose a novel method to "guess" which attributes are important, which correlations are important, and with what weight.

Our contributions are twofold: (a) we formalize the problem as a minimization problem and show how to solve for the optimal solution, completely avoiding the ad-hoc heuristics of the past; (b) moreover, we are the first that can handle 'diagonal' queries (like the 'overweight' query above). Experiments on synthetic and real datasets show that our method quickly and accurately estimates the 'hidden' distance function in the user's mind.
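
To make the idea of a 'diagonal' query concrete, here is a minimal two-attribute sketch. The abstract does not give the formulation, so everything below is an assumption made for illustration: the distance is taken to be a quadratic form d(x) = (x - q)^T M (x - q) with det(M) = 1, the query point q is the goodness-weighted mean of the examples, and M is the inverse of the weighted scatter matrix rescaled to unit determinant. The function names and the sample (height, weight) values are hypothetical.

```python
from math import sqrt

def estimate_distance(examples, scores=None):
    """Estimate a query point q and a 2x2 matrix M from 2-D examples.

    Assumed model (not taken from the abstract):
    d(x) = (x - q)^T M (x - q), with det(M) = 1.
    """
    if scores is None:
        scores = [1.0] * len(examples)  # equal goodness by default
    total = sum(scores)
    # Query point: goodness-weighted average of the examples.
    qx = sum(v * x for v, (x, _) in zip(scores, examples)) / total
    qy = sum(v * y for v, (_, y) in zip(scores, examples)) / total
    # Weighted scatter matrix C of the examples around q.
    cxx = sum(v * (x - qx) ** 2 for v, (x, _) in zip(scores, examples))
    cyy = sum(v * (y - qy) ** 2 for v, (_, y) in zip(scores, examples))
    cxy = sum(v * (x - qx) * (y - qy) for v, (x, y) in zip(scores, examples))
    det = cxx * cyy - cxy * cxy
    if det <= 0:
        raise ValueError("degenerate examples: need non-collinear points")
    # M = sqrt(det(C)) * C^{-1} in two dimensions, so that det(M) = 1.
    s = sqrt(det)
    m = [[cyy / s, -cxy / s], [-cxy / s, cxx / s]]
    return (qx, qy), m

def distance(point, q, m):
    """Quadratic-form distance of `point` from q under the symmetric matrix m."""
    dx, dy = point[0] - q[0], point[1] - q[1]
    return m[0][0] * dx * dx + 2 * m[0][1] * dx * dy + m[1][1] * dy * dy

# Hypothetical (height in inches, weight in lb) examples near weight = 4 * height.
examples = [(60, 241), (65, 259), (70, 281), (75, 299)]
q, m = estimate_distance(examples)

# Two candidates equally far from q in the Euclidean sense:
on_diagonal = distance((q[0] + 5, q[1] + 20), q, m)   # follows the 4 lb/inch trend
off_diagonal = distance((q[0] + 5, q[1] - 20), q, m)  # moves against the trend
print(on_diagonal < off_diagonal)
```

Because M has a nonzero off-diagonal entry, the sketch penalizes deviation from the weight/height trend rather than deviation in either attribute alone, which a purely per-attribute weighting cannot express.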


BIBTEX

@inproceedings{DBLP:conf/vldb/IshikawaSF98,
author = {Yoshiharu Ishikawa and
Ravishankar Subramanya and
Christos Faloutsos},
editor = {Ashish Gupta and
Oded Shmueli and
Jennifer Widom},
title = {MindReader: Querying Databases Through Multiple Examples},
booktitle = {VLDB'98, Proceedings of 24th International Conference on Very
Large Data Bases, August 24-27, 1998, New York City, New York,
USA},
publisher = {Morgan Kaufmann},
year = {1998},
isbn = {1-55860-566-5},
pages = {218-227},
crossref = {DBLP:conf/vldb/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}

