Welcome to DiSC 2002
SIGMOD 2001
SIGMOD'01 Papers

Query optimization in compressed database systems


Zhiyuan Chen, Johannes Gehrke, and Flip Korn



Abstract

Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techniques to improve the performance of database systems. Previous work describes the benefits of compression for numerical attributes, where data is stored in compressed format on disk. Despite the abundance of string-valued attributes in relational schemas, there is little work on compression for string attributes in a database context. Moreover, none of the previous work suitably addresses the role of the query optimizer: during query execution, data is either eagerly decompressed when it is read into main memory, or data lazily stays compressed in main memory and is decompressed on demand only. In this paper, we present an effective approach for database compression based on lightweight, attribute-level compression techniques. We propose a Hierarchical Dictionary Encoding strategy that intelligently selects the most effective compression method for string-valued attributes. We show that eager and lazy decompression strategies produce sub-optimal plans for queries involving compressed string attributes. We then formalize the problem of compression-aware query optimization and propose one provably optimal and two fast heuristic algorithms for selecting a query plan for relational schemas with compressed attributes; our algorithms can easily be integrated into existing cost-based query optimizers. Experiments using TPC-H data demonstrate the impact of our string compression methods and show the importance of compression-aware query optimization. Our approach results in up to an order of magnitude speedup over existing approaches.
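To make the eager versus lazy distinction in the abstract concrete, here is a minimal Python sketch. It uses a plain, single-level dictionary encoding and is not the paper's Hierarchical Dictionary Encoding or any of its optimization algorithms. The function names and the sample column values are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[int], List[str]]:
    """Replace each string with a small integer code.

    Returns the code column and the table that maps code -> string.
    Plain dictionary encoding, assumed here for illustration only.
    """
    table: List[str] = []
    index: Dict[str, int] = {}
    codes: List[int] = []
    for value in values:
        if value not in index:
            index[value] = len(table)
            table.append(value)
        codes.append(index[value])
    return codes, table


def filter_eager(codes: List[int], table: List[str], wanted: str) -> List[int]:
    """Eager strategy: decompress every value first, then compare strings."""
    decoded = [table[code] for code in codes]
    return [row for row, value in enumerate(decoded) if value == wanted]


def filter_lazy(codes: List[int], table: List[str], wanted: str) -> List[int]:
    """Lazy strategy: encode the constant once, then compare integer codes.

    No row is decompressed while evaluating the equality predicate.
    """
    if wanted not in table:
        return []
    target = table.index(wanted)
    return [row for row, code in enumerate(codes) if code == target]


# Hypothetical string-valued column.
ship_modes = ["AIR", "RAIL", "AIR", "TRUCK", "RAIL", "AIR"]
codes, table = dictionary_encode(ship_modes)
rail_rows_eager = filter_eager(codes, table, "RAIL")
rail_rows_lazy = filter_lazy(codes, table, "RAIL")
```

Both functions select the same rows, but the eager version materializes every string while the lazy version touches only integer codes. Which choice is cheaper depends on what the rest of the plan does with the column, which is one way to read the abstract's argument that the decompression decision belongs in the optimizer.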


DiSC'02 © 2003 Association for Computing Machinery