SBBD 2003

Implementing and Evaluating Warehouses and Summaries over a Cluster


Pedro Furtado


Abstract

Cluster computation power provides a promising way to improve response time in large data warehouses. In addition, the use of sampling summaries on the cluster for approximate answering of OLAP queries yields a very flexible system that can provide response time guarantees. In this paper we explore the cluster computation paradigm for data warehouses and summaries. Cluster computation in a network with N computers can speed up query processing about N times, and further speedup can be obtained by using samples instead of the full data. Sampling summaries have been proposed before in the context of OLAP queries to avoid query processing times that leave users and applications waiting too long when only exploratory analysis is required over more or less aggregated data. But while a typical one-node sampling summary is either too small to answer more detailed queries or too slow to provide almost instant response time, summaries over a cluster are extremely fast and sufficiently large to answer most aggregation query patterns. We explore the implementation and processing of the data warehouse and sampling summaries over a set of nodes for cooperative cluster computing and present experimental results on the subject.
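
The idea of answering an aggregation query from per-node sampling summaries can be illustrated with a minimal sketch. It is not the paper's implementation: the round-robin placement, the uniform sampling, and all function names (`partition`, `build_summary`, `local_estimate`, `approximate_sum`) are assumptions made for illustration, and the nodes are simulated in a single process. Each simulated node keeps a uniform sample of its partition, estimates its local SUM by scaling the sample total to the partition size, and the partial estimates are added together.

```python
import random


def partition(rows, n_nodes):
    """Split measure values across n_nodes round-robin (assumed placement)."""
    return [rows[i::n_nodes] for i in range(n_nodes)]


def build_summary(node_rows, fraction, rng):
    """Draw a uniform random sample (without replacement) of one partition."""
    k = max(1, round(len(node_rows) * fraction))
    return rng.sample(node_rows, k)


def local_estimate(sample, partition_size):
    """Estimate the partition SUM by scaling the sample SUM to partition size."""
    return sum(sample) * partition_size / len(sample)


def approximate_sum(partitions, fraction, seed=0):
    """Combine per-node estimates into one approximate global SUM.

    In a real cluster each node would compute its estimate in parallel;
    here the nodes are visited in a loop purely for illustration.
    """
    rng = random.Random(seed)
    total = 0.0
    for node_rows in partitions:
        if not node_rows:
            continue  # an empty partition contributes nothing
        sample = build_summary(node_rows, fraction, rng)
        total += local_estimate(sample, len(node_rows))
    return total


if __name__ == "__main__":
    values = list(range(1, 1001))      # hypothetical measure column
    nodes = partition(values, 4)       # 4 simulated nodes
    print("exact SUM:      ", sum(values))
    print("approximate SUM:", approximate_sum(nodes, fraction=0.1))
```

With a sampling fraction of 1.0 the estimate coincides with the exact SUM; smaller fractions trade accuracy for less data scanned per node, which is the trade-off the abstract describes.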


©2004 Association for Computing Machinery