Cost-Based Optimization of Decision Support Queries Using Transient Views
Subbu N. Subramanian, Shivakumar Venkataraman

Abstract
Next-generation decision support applications, besides being capable of processing huge amounts of data, require the ability to integrate and reason over data from multiple, heterogeneous data sources. Often, these data sources differ in a variety of aspects such as their data models, the query languages they support, and their network protocols. They are also typically spread over a wide geographical area. The cost of processing decision support queries in such a setting is quite high. However, processing these queries often involves redundancies such as repeated access to the same data source and multiple executions of similar processing sequences. Minimizing these redundancies would significantly reduce the query processing cost. In this paper, we (1) propose an architecture for processing complex decision support queries involving multiple, heterogeneous data sources; (2) introduce the notion of transient-views (materialized views that exist only in the context of the execution of a query), which are useful for minimizing the redundancies involved in the execution of these queries; (3) develop a cost-based algorithm that takes a query plan as input and generates an optimal "covering plan" by minimizing redundancies in the original plan; (4) validate our approach by means of an implementation of the algorithms and a detailed performance study based on TPC-D benchmark queries on a commercial database system; and finally, (5) compare and contrast our approach with work in related areas, in particular, the areas of answering queries using views and optimization using common sub-expressions. Our experiments demonstrate the practicality and usefulness of transient-views in significantly improving the performance of decision support queries.

References


[Abi97]
Serge Abiteboul: Querying Semi-Structured Data. ICDT 1997: 1-18
[AMMT96]
...
[ASD+91]
Rafi Ahmed, Philippe De Smedt, Weimin Du, William Kent, Mohammad A. Ketabchi, Witold Litwin, Abbas Rafii, Ming-Chien Shan: The Pegasus Heterogeneous Multidatabase System. IEEE Computer 24(12): 19-27(1991)
[CAS94]
Vassilis Christophides, Serge Abiteboul, Sophie Cluet, Michel Scholl: From Structured Documents to Novel Query Facilities. SIGMOD Conference 1994: 313-324
[CGH+94]
Sudarshan S. Chawathe, Hector Garcia-Molina, Joachim Hammer, Kelly Ireland, Yannis Papakonstantinou, Jeffrey D. Ullman, Jennifer Widom: The TSIMMIS Project: Integration of Heterogeneous Information Sources. IPSJ 1994: 7-18
[CKPS95]
Surajit Chaudhuri, Ravi Krishnamurthy, Spyros Potamianos, Kyuseok Shim: Optimizing Queries with Materialized Views. ICDE 1995: 190-200
[DFJ+96]
Shaul Dar, Michael J. Franklin, Björn Þór Jónsson, Divesh Srivastava, Michael Tan: Semantic Data Caching and Replacement. VLDB 1996: 330-341
[Fin82]
Sheldon J. Finkelstein: Common Subexpression Analysis in Database Applications. SIGMOD Conference 1982: 235-245
[GBLP95]
Jim Gray, Surajit Chaudhuri, Adam Bosworth, Andrew Layman, Don Reichart, Murali Venkatrao, Frank Pellow, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-Tab, and Sub Totals. Data Mining and Knowledge Discovery 1(1): 29-53(1997)
[GHRU97]
Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Index Selection for OLAP. ICDE 1997: 208-219
[GLS93]
Peter Gassner, Guy M. Lohman, K. Bernhard Schiefer, Yun Wang: Query Optimization in the IBM DB2 Family. Data Engineering Bulletin 16(4): 4-18(1993)
[GM80]
...
[GM81]
...
[Gup97]
Himanshu Gupta: Selection of Views to Materialize in a Data Warehouse. ICDT 1997: 98-112
[Hal76]
...
[HFLP89]
Laura M. Haas, Johann Christoph Freytag, Guy M. Lohman, Hamid Pirahesh: Extensible Query Processing in Starburst. SIGMOD Conference 1989: 377-388
[HKWY97]
Laura M. Haas, Donald Kossmann, Edward L. Wimmers, Jun Yang: Optimizing Queries Across Diverse Data Sources. VLDB 1997: 276-285
[HRU96]
Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216
[Jar84]
...
[JV84]
Matthias Jarke, Jürgen Koch: Query Optimization in Database Systems. Computing Surveys 16(2): 111-152(1984)
[Kim84]
...
[LMSS95]
Alon Y. Levy, Alberto O. Mendelzon, Yehoshua Sagiv, Divesh Srivastava: Answering Queries Using Views. PODS 1995: 95-104
[LRO96]
Alon Y. Levy, Anand Rajaraman, Joann J. Ordille: Querying Heterogeneous Information Sources Using Source Descriptions. VLDB 1996: 251-262
[LSS96]
Laks V. S. Lakshmanan, Fereidoon Sadri, Iyer N. Subramanian: SchemaSQL - A Language for Interoperability in Relational Multi-Database Systems. VLDB 1996: 239-250
[PGW95]
Yannis Papakonstantinou, Hector Garcia-Molina, Jennifer Widom: Object Exchange Across Heterogeneous Information Sources. ICDE 1995: 251-260
[PHH92]
Hamid Pirahesh, Joseph M. Hellerstein, Waqar Hasan: Extensible/Rule Based Query Rewrite Optimization in Starburst. SIGMOD Conference 1992: 39-48
[Rou82a]
Nick Roussopoulos: The Logical Access Path Schema of a Database. TSE 8(6): 563-573(1982)
[Rou82b]
Nick Roussopoulos: View Indexing in Relational Databases. TODS 7(2): 258-290(1982)
[RSS96]
Kenneth A. Ross, Divesh Srivastava, S. Sudarshan: Materialized View Maintenance and Integrity Constraint Checking: Trading Space for Time. SIGMOD Conference 1996: 447-458
[SAB+95]
...
[SAC+79]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34
[SDJL96]
Divesh Srivastava, Shaul Dar, H. V. Jagadish, Alon Y. Levy: Answering Queries with Aggregation Using Views. VLDB 1996: 318-329
[Sel88]
Timos K. Sellis: Multiple-Query Optimization. TODS 13(1): 23-52(1988)
[SV98]
...
[TPC93]
...
[TRS97]
Mary Tork Roth, Peter M. Schwarz: Don't Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources. VLDB 1997: 266-275
[TRV96]
Anthony Tomasic, Louiqa Raschid, Patrick Valduriez: Scaling Heterogeneous Databases and the Design of Disco. ICDCS 1996: 449-457
[VZ97]
...
[WY76]
Eugene Wong, Karel Youssefi: Decomposition - A Strategy for Query Processing. TODS 1(3): 223-241(1976)
BIBTEX

@inproceedings{DBLP:conf/sigmod/SubramanianV98,
  author    = {Subbu N. Subramanian and
               Shivakumar Venkataraman},
  editor    = {Laura M. Haas and
               Ashutosh Tiwary},
  title     = {Cost-Based Optimization of Decision Support Queries Using Transient
               Views},
  booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
               on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-955-5},
  pages     = {319-330},
  crossref  = {DBLP:conf/sigmod/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}


DBLP: Copyright ©1999 by Michael Ley (ley@uni-trier.de).