Cost Based Query Scrambling for Initial Delays
Tolga Urhan, Michael J. Franklin, Laurent Amsaleg
Full Paper (PDF)

Abstract
Remote data access from disparate sources across a wide-area network such as the Internet is problematic due to the unpredictable nature of the communications medium and the lack of knowledge about the load and potential delays at remote sites. Traditional, static, query processing approaches break down in this environment because they are unable to adapt in response to unexpected delays. Query scrambling has been proposed to address this problem. Scrambling modifies query execution plans on-the-fly when delays are encountered during runtime. In its original formulation, scrambling was based on simple heuristics, which although providing good performance in many cases, were also shown to be susceptible to problems resulting from bad scrambling decisions. In this paper we address these shortcomings by investigating ways to exploit query optimization technology to aid in making intelligent scrambling choices. We propose three different approaches to using query optimization for scrambling. These approaches vary, for example, in whether they optimize for total work or response-time, and whether they construct partial or complete alternative plans. Using a two-phase randomized query optimizer, a distributed query processing simulator, and a workload derived from queries of the TPC-D benchmark, we evaluate these different approaches and compare their ability to cope with initial delays in accessing remote sources. The results show that cost-based scrambling can effectively hide initial delays, but that in the absence of good predictions of expected delay durations, there are fundamental tradeoffs between risk aversion and effectiveness.

References

References, where available, link to the DBLP on the World Wide Web.

[ABF+97]
Laurent Amsaleg, Philippe Bonnet, Michael J. Franklin, Anthony Tomasic, Tolga Urhan: Improving Responsiveness for Wide-Area Data Access. Data Engineering Bulletin 20(3): 3-11(1997)
[AFT98]
...
[AFTU96]
Laurent Amsaleg, Michael J. Franklin, Anthony Tomasic, Tolga Urhan: Scrambling Query Plans to Cope With Unexpected Delays. PDIS 1996: 0-
[Ant93]
Gennady Antoshenkov: Dynamic Query Optimization in Rdb/VMS. ICDE 1993: 538-547
[Bro92]
...
[CBTY89]
...
[CD96]
Michael J. Carey, David J. DeWitt: Of Objects and Databases: A Decade of Turmoil. VLDB 1996: 3-14
[CG94]
Richard L. Cole, Goetz Graefe: Optimization of Dynamic Query Evaluation Plans. SIGMOD Conference 1994: 150-160
[DSD95]
Weimin Du, Ming-Chien Shan, Umeshwar Dayal: Reducing Multidatabase Query Response Time by Tree Balancing. SIGMOD Conference 1995: 293-303
[GHK92]
Sumit Ganguly, Waqar Hasan, Ravi Krishnamurthy: Query Optimization for Parallel Execution. SIGMOD Conference 1992: 9-18
[Gra93]
Goetz Graefe: Query Evaluation Techniques for Large Databases. Computing Surveys 25(2): 73-170(1993)
[IK90]
Yannis E. Ioannidis, Younkyung Cha Kang: Randomized Algorithms for Optimizing Large Join Queries. SIGMOD Conference 1990: 312-321
[INSS92]
Yannis E. Ioannidis, Raymond T. Ng, Kyuseok Shim, Timos K. Sellis: Parametric Query Optimization. VLDB 1992: 103-114
[IW87]
Yannis E. Ioannidis, Eugene Wong: Query Optimization by Simulated Annealing. SIGMOD Conference 1987: 9-22
[Kim95]
Won Kim (Ed.): Modern Database Systems: The Object Model, Interoperability, and Beyond. ACM Press and Addison-Wesley 1995, ISBN 0-201-59098-0
Contents
[MMM97]
Alberto O. Mendelzon, George A. Mihaila, Tova Milo: Querying the World Wide Web. PDIS 1996: 80-91
[ML86]
Lothar F. Mackert, Guy M. Lohman: R* Optimizer Validation and Performance Evaluation for Distributed Queries. VLDB 1986: 149-159
[ONK+97]
Fatma Ozcan, Sena Nural, Pinar Koksal, Cem Evrendilek, Asuman Dogac: Dynamic Query Optimization in Multidatabases. Data Engineering Bulletin 20(3): 38-45(1997)
[RAH+96]
Mary Tork Roth, Manish Arya, Laura M. Haas, Michael J. Carey, William F. Cody, Ronald Fagin, Peter M. Schwarz, Joachim Thomas II, Edward L. Wimmers: The Garlic Project. SIGMOD Conf. 1996: 557
[SAC+79]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34
[SAD+95]
...
[Sha86]
Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. TODS 11(3): 239-264(1986)
[SLR97]
Praveen Seshadri, Miron Livny, Raghu Ramakrishnan: The Case for Enhanced Abstract Data Types. VLDB 1997: 66-75
[THMB95]
Marjorie Templeton, Herbert Henley, Edward Maros, Darrel J. Van Buer: InterViso: Dealing With the Complexity of Federated Database Access. VLDB Journal 4(2): 287-317(1995)
[Tra97]
...
[TRV96]
Anthony Tomasic, Louiqa Raschid, Patrick Valduriez: Scaling Heterogeneous Databases and the Design of Disco. ICDCS 1996: 449-457
BIBTEX

@inproceedings{DBLP:conf/sigmod/UrhanFA98,
author = {Tolga Urhan and
Michael J. Franklin and
Laurent Amsaleg},
editor = {Laura M. Haas and
Ashutosh Tiwary},
title = {Cost Based Query Scrambling for Initial Delays},
booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
publisher = {ACM Press},
year = {1998},
isbn = {0-89791-955-5},
pages = {130-141},
crossref = {DBLP:conf/sigmod/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}


DBLP: Copyright ©1999 by Michael Ley (ley@uni-trier.de).