
WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web


Roy Goldman and Jennifer Widom



Abstract

We present WSQ/DSQ (pronounced "wisk-disk"), a new approach for combining the query facilities of traditional databases with existing search engines on the Web. WSQ, for Web-Supported (Database) Queries, leverages results from Web searches to enhance SQL queries over a relational database. DSQ, for Database-Supported (Web) Queries, uses information stored in the database to enhance and explain Web searches. This paper focuses primarily on WSQ, describing a simple, low-overhead way to support WSQ in a relational DBMS, and demonstrating the utility of WSQ with a number of interesting queries and results. The queries supported by WSQ are enabled by two virtual tables, whose tuples represent Web search results generated dynamically during query execution. WSQ query execution may involve many high-latency calls to one or more search engines, during which the query processor is idle. We present a lightweight technique called asynchronous iteration that can be integrated easily into a standard sequential query processor to enable concurrency between query processing and multiple Web search requests. Asynchronous iteration has broader applications than WSQ alone, and it opens up many interesting query optimization issues. We have developed a prototype implementation of WSQ by extending a DBMS with virtual tables and asynchronous iteration; performance results are reported.
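
The concurrency idea in the abstract can be illustrated with a small sketch. This is not the authors' asynchronous iteration algorithm or their prototype, only a minimal illustration of the general principle: overlapping many high-latency search calls instead of idling on each one. The function `fake_web_count`, its latency, its made-up hit counts, and the names `sequential_lookup` and `overlapped_lookup` are all assumptions introduced here; a thread pool is swapped in as the concurrency mechanism.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_web_count(term):
    """Hypothetical stand-in for a high-latency search-engine call."""
    time.sleep(0.05)          # simulated network latency (assumed value)
    return len(term) * 1000   # made-up "hit count", not real search data


def sequential_lookup(terms, search=fake_web_count):
    """Baseline: the caller sits idle while each search completes."""
    return [(term, search(term)) for term in terms]


def overlapped_lookup(terms, search=fake_web_count, max_workers=8):
    """Issue every search without waiting, then collect results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Phase 1: scan the input and launch one request per tuple.
        pending = [(term, pool.submit(search, term)) for term in terms]
        # Phase 2: block only when a result is actually needed.
        return [(term, future.result()) for term, future in pending]


if __name__ == "__main__":
    states = ["Texas", "Ohio", "Utah", "Maine"]
    for fn in (sequential_lookup, overlapped_lookup):
        start = time.perf_counter()
        rows = fn(states)
        print(fn.__name__, rows, f"{time.perf_counter() - start:.2f}s")
```

Both functions return the same rows. With the simulated latency, the overlapped version should finish in roughly the time of one call rather than the sum of all calls, which is the kind of saving the abstract attributes to running query processing and Web requests concurrently.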



BIBTEX


@inproceedings{DBLP:conf/sigmod/GoldmanW00,
  author    = {Roy Goldman and
               Jennifer Widom},
  editor    = {Weidong Chen and
               Jeffrey F. Naughton and
               Philip A. Bernstein},
  title     = {WSQ/DSQ: A Practical Approach for Combined Querying of Databases
               and the Web},
  booktitle = {Proceedings of the 2000 ACM SIGMOD International Conference on
               Management of Data, May 16-18, 2000, Dallas, Texas, USA},
  journal   = {SIGMOD Record},
  publisher = {ACM},
  volume    = {29},
  number    = {2},
  year      = {2000},
  isbn      = {1-58113-218-2},
  pages     = {285-296},
  crossref  = {DBLP:conf/sigmod/2000},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}




DiSC'01 Copyright ©2002 ACM Inc.