What can Hierarchies do for Data Warehouses?
H. V. Jagadish, Laks V. S. Lakshmanan, and Divesh Srivastava
Data in a warehouse typically has multiple dimensions of interest, such as location, time, and product. It is well recognized that these dimensions have hierarchies defined on them, such as "store-city-state-region" for location. The standard way to model such data is with a star/snowflake schema. However, current approaches do not give first-class status to dimensions. Consequently, a substantial class of interesting queries involving dimension hierarchies and their interaction with the fact tables is verbose to write, hard to read, and difficult to optimize.
We propose the SQL(H) model, a natural extension to the SQL query language that gives first-class status to dimensions, and we pin down its semantics. Our model permits structural and schematic heterogeneity in dimension hierarchies, situations that often arise in practice and cannot be modeled satisfactorily using the star/snowflake approach. We show using examples that sophisticated queries involving dimension hierarchies and their interplay with aggregation can be expressed concisely in SQL(H). By comparison, expressing such queries in SQL would involve a union of numerous complex sequences of joins. Finally, we develop an efficient implementation strategy for computing SQL(H) queries, based on an algorithm for hierarchical joins and the use of dimension indexes.
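To make the idea of a structurally heterogeneous hierarchy concrete, here is a minimal Python sketch. It is not SQL(H) syntax and not the paper's hierarchical join algorithm; the node names, the `DIMENSION` and `FACTS` data, and the helpers `ancestor_at` and `roll_up` are all hypothetical, chosen only for illustration. It assumes a location dimension stored as parent links, in which one store attaches directly to a state and skips the city level.

```python
from collections import defaultdict

# Hypothetical location dimension: node -> (level, parent).
# Store "s3" skips the city level and attaches directly to a state,
# which is the kind of structural heterogeneity the abstract mentions.
DIMENSION = {
    "s1": ("store", "Austin"),
    "s2": ("store", "Dallas"),
    "s3": ("store", "TX"),
    "s4": ("store", "Newark"),
    "Austin": ("city", "TX"),
    "Dallas": ("city", "TX"),
    "Newark": ("city", "NJ"),
    "TX": ("state", "South"),
    "NJ": ("state", "East"),
    "South": ("region", None),
    "East": ("region", None),
}

# Hypothetical fact rows: (store, sales amount).
FACTS = [("s1", 10), ("s2", 20), ("s3", 5), ("s4", 7), ("s1", 3)]


def ancestor_at(node, level):
    """Follow parent links until a node at the requested level is found."""
    while node is not None:
        node_level, parent = DIMENSION[node]
        if node_level == level:
            return node
        node = parent
    return None  # this node has no ancestor at that level


def roll_up(facts, level):
    """Sum sales grouped by each store's ancestor at the given level."""
    totals = defaultdict(int)
    for store, amount in facts:
        group = ancestor_at(store, level)
        if group is not None:  # stores lacking that level are left out
            totals[group] += amount
    return dict(totals)
```

Calling `roll_up(FACTS, "state")` groups every store under its state regardless of path length, while `roll_up(FACTS, "city")` leaves out `s3` because it has no city ancestor. In a fixed star/snowflake layout, each distinct path shape would typically need its own join sequence, which is roughly the verbosity the abstract attributes to plain SQL.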
@inproceedings{DBLP:conf/vldb/JagadishLS99,
author = {H. V. Jagadish and
Laks V. S. Lakshmanan and
Divesh Srivastava},
editor = {Malcolm P. Atkinson and
Maria E. Orlowska and
Patrick Valduriez and
Stanley B. Zdonik and
Michael L. Brodie},
title = {What can Hierarchies do for Data Warehouses?},
booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
UK},
publisher = {Morgan Kaufmann},
year = {1999},
isbn = {1-55860-615-5},
pages = {530-541},
crossref = {DBLP:conf/vldb/99},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Copyright(C) 2000 ACM