2003 Digital Symposium Collection

Efficient Query Processing for Multi-Dimensionally Clustered Tables in DB2

Bishwaranjan Bhattacharjee, Sriram Padmanabhan, Timothy Malkemus, Tony Lai, Leslie Cranston, and Matthew Huras

Session C4: Multidimensionality & Bioinformatics

Abstract

We have introduced a Multi-Dimensional Clustering (MDC) physical layout scheme in DB2 version 8.0 for relational tables. Multi-Dimensional Clustering is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by associating records with similar values for the dimension attributes in a cluster. Each clustering key is allocated one or more blocks of physical storage with the aim of storing the multiple records belonging to the cluster in almost contiguous fashion. Block oriented indexes are created to access these blocks. In this paper, we describe novel techniques for query processing operations that provide significant performance improvements for MDC tables. Current database systems employ a repertoire of access methods including table scans, index scans, index ANDing, and index ORing. We have extended these access methods for efficiently processing the block based MDC tables. One important concept at the core of processing MDC tables is the block oriented access technique. In addition, since MDC tables can include regular record oriented indexes, we employ novel techniques to combine block and record indexes. Block oriented processing is extended to nested loop joins and star joins as well. We show results from experiments using a star-schema database to validate our claims of performance with minimal overhead.
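The block oriented layout and block index ANDing described in the abstract can be illustrated with a toy in-memory model. The following is a minimal sketch and not DB2's implementation: the class `ToyMDCTable`, its methods, and the tiny `BLOCK_SIZE` are all hypothetical names chosen for illustration. It assumes that each block holds records from exactly one cell (one combination of dimension values) and that each dimension has a block index mapping a value to block ids rather than to individual records.

```python
from collections import defaultdict

BLOCK_SIZE = 2  # records per block; unrealistically small, for illustration only


class ToyMDCTable:
    """Hypothetical toy model of a multi-dimensionally clustered table."""

    def __init__(self, dimensions):
        self.dimensions = list(dimensions)
        self.blocks = []                      # block id -> list of records
        self.cell_blocks = defaultdict(list)  # cell key -> block ids of that cell
        # One block index per dimension: value -> set of block ids.
        self.block_index = {d: defaultdict(set) for d in self.dimensions}

    def insert(self, record):
        """Place the record in a block of its cell, allocating a block if needed."""
        key = tuple(record[d] for d in self.dimensions)
        ids = self.cell_blocks[key]
        if not ids or len(self.blocks[ids[-1]]) >= BLOCK_SIZE:
            self.blocks.append([])
            block_id = len(self.blocks) - 1
            ids.append(block_id)
            # A new block is registered once per dimension, not once per record.
            for d, v in zip(self.dimensions, key):
                self.block_index[d][v].add(block_id)
        self.blocks[ids[-1]].append(record)

    def block_and(self, predicates):
        """Block index ANDing: intersect block id sets for equality predicates."""
        id_sets = [self.block_index[d].get(v, set()) for d, v in predicates.items()]
        if not id_sets:
            return list(range(len(self.blocks)))  # no predicate: every block
        return sorted(set.intersection(*id_sets))

    def select(self, predicates):
        """Scan only qualifying blocks; every record in them satisfies the predicates."""
        return [r for b in self.block_and(predicates) for r in self.blocks[b]]
```

Because a block belongs to a single cell in this sketch, intersecting block id sets is enough to answer equality predicates on the dimensions, and no per-record filtering is needed once the qualifying blocks are known. The index entries are also far fewer than in a record oriented index, since there is one entry per block rather than one per row.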