The Pyramid-Tree: Breaking the Curse of Dimensionality
Stefan Berchtold, Christian Bohm, Hans-Peter Kriegel
Full Paper (PDF)

Abstract
In this paper, we propose the Pyramid-Technique, a new indexing method for high-dimensional data spaces. The Pyramid-Technique is highly adapted to range query processing using the maximum metric Lmax. In contrast to all other index structures, the performance of the Pyramid-Technique does not deteriorate when processing range queries on data of higher dimensionality. The Pyramid-Technique is based on a special partitioning strategy which is optimized for high-dimensional data. The basic idea is to divide the data space first into 2d pyramids sharing the center point of the space as a top. In a second step, the single pyramids are cut into slices parallel to the basis of the pyramid. These slices form the data pages. Furthermore, we show that this partition provides a mapping from the given d-dimensional space to a 1-dimensional space. Therefore, we are able to use a B+-tree to manage the transformed data. As an analytical evaluation of our technique for hypercube range queries and uniform data distribution shows, the Pyramid-Technique clearly outperforms index structures using other partitioning strategies. To demonstrate the practical relevance of our technique, we experimentally compared the Pyramid-Technique with the X-tree, the Hilbert R-tree, and the Linear Scan. The results of our experiments using both, synthetic and real data, demonstrate that the Pyramid-Technique outperforms the X-tree and the Hilbert R-tree by a factor of up to 14 (number of page accesses) and up to 2500 (total elapsed time) for range queries.

References

References, where available, link to the DBLP on the World Wide Web.

[1]
Rudolf Bayer, Edward M. McCreight: Organization and Maintenance of Large Ordered Indices. Acta Informatica 1: 173-189(1972)
[2]
Stefan Berchtold, Christian Böhm, Bernhard Braunmüller, Daniel A. Keim, Hans-Peter Kriegel: Fast Similarity Search in Multimedia Databases. SIGMOD Conference 1997: 1-12
[3]
Stefan Berchtold, Christian Böhm, Hans-Peter Kriegel: Improving the Query Performance of High-Dimensional Index Structures by Bulk-Load Operations. EDBT 1998: 216-230
[4]
...
[5]
Stefan Berchtold, Christian Böhm, Daniel A. Keim, Hans-Peter Kriegel: A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space. PODS 1997: 78-86
[6]
Stefan Berchtold, Daniel A. Keim, Hans-Peter Kriegel: The X-tree : An Index Structure for High-Dimensional Data. VLDB 1996: 28-39
[7]
Stefan Berchtold, Bernhard Ertl, Daniel A. Keim, Hans-Peter Kriegel, Thomas Seidl: Fast Nearest Neighbor Search in High-Dimensional Space. ICDE 1998: 209-218
[8]
...
[9]
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. SIGMOD Conference 1990: 322-331
[10]
Douglas Comer: The Ubiquitous B-Tree. Computing Surveys 11(2): 121-137(1979)
[11]
Surajit Chaudhuri, Umeshwar Dayal: Data Warehousing and OLAP for Decision Support (Tutorial). SIGMOD Conference 1997: 507-508
[12]
Christos Faloutsos, Ron Barber, Myron Flickner, Jim Hafner, Wayne Niblack, Dragutin Petkovic, William Equitz: Efficient and Effective Querying by Image Content. JIIS 3(3/4): 231-262(1994)
[13]
Christos Faloutsos, Pravin Bhagwat: Declustering Using Fractals. PDIS 1993: 18-25
[14]
Jerome H. Friedman, Jon Louis Bentley, Raphael A. Finkel: An Algorithm for Finding Best Matches in Logarithmic Expected Time. TOMS 3(3): 209-226(1977)
[15]
Gisli R. Hjaltason, Hanan Samet: Ranking in Spatial Databases. SSD 1995: 83-95
[16]
H. V. Jagadish: A Retrieval Technique for Similar Shapes. SIGMOD Conference 1991: 208-217
[17]
David A. White, Ramesh Jain: Similarity Indexing: Algorithms and Performance. Storage and Retrieval for Image and Video Databases (SPIE) 1996: 62-73
[18]
Norio Katayama, Shin'ichi Satoh: The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries. SIGMOD Conference 1997: 369-380
[19]
King-Ip Lin, H. V. Jagadish, Christos Faloutsos: The TV-Tree: An Index Structure for High-Dimensional Data. VLDB Journal 3(4): 517-542(1994)
[20]
Rajiv Mehrotra, James E. Gary: Feature-Based Retrieval of Similar Shapes. ICDE 1993: 108-115
[21]
John T. Robinson: The K-D-B-Tree: A Search Structure For Large Multidimensional Dynamic Indexes. SIGMOD Conference 1981: 10-18
[22]
Thomas Seidl, Hans-Peter Kriegel: Efficient User-Adaptable Similarity Search in Large Multimedia Databases. VLDB 1997: 506-515
[23]
David A. White, Ramesh Jain: Similarity Indexing with the SS-tree. ICDE 1996: 516-523
BIBTEX

@inproceedings{DBLP:conf/sigmod/BerchtoldBK98,
author = {Stefan Berchtold and
Christian B{\"o}hm and
Hans-Peter Kriegel},
editor = {Laura M. Haas and
Ashutosh Tiwary},
title = {The Pyramid-Tree: Breaking the Curse of Dimensionality},
booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
publisher = {ACM Press},
year = {1998},
isbn = {0-89791-955-5},
pages = {142-153},
crossref = {DBLP:conf/sigmod/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}


DBLP: Copyright ©1999 by Michael Ley (ley@uni-trier.de).