Masaru Kitsuregawa


2009 SIGMOD Edgar F. Codd Innovations Award

Masaru Kitsuregawa is the recipient of the 2009 SIGMOD Edgar F. Codd Innovations Award for contributions to high-performance database technology.

Kitsuregawa made major contributions to the development of hash-join algorithms, which significantly improved the performance of join operations in relational database systems. That work has influenced related research in areas such as query execution, plan optimization and dynamic query-workload balancing, as well as the development of commercial database products. He implemented the hash-based approach on a variety of platforms, including the Functional Disk System and multi-node PC clusters, demonstrating its substantial advantages through detailed evaluations. He also applied hash-based strategies to parallel association mining and showed their effectiveness there. His contributions in the hardware area include a high-speed sorting system with a sophisticated memory management algorithm. That work was eventually commercialized in collaboration with colleagues and won the Datamation sort benchmark in 2000.


Professor Kitsuregawa has contributed extensively to the area of high-performance database systems, particularly involving hash-based methods. The work began in the early 1980s in the context of the GRACE relational-database machine. He is particularly known for his work on hash-based join algorithms, which is still widely cited. By the late 1980s and early 1990s, others had built on that work to develop various hybrid versions of hash join, and most database conferences of that time had sessions devoted to the topic. His own refinements include dynamic destaging and bucket tuning. At the time, most commercial relational-database products used only nested-loop and sort-based joins. Nearly all current systems include hash-based join implementations. He also contributed hash-based approaches to aggregation operations.

He went on to implement the Functional Disk System (FDS), a parallel, hash-based relational system with a shared-memory architecture. He demonstrated that efficient parallel execution of relational operations was possible with hash-based methods, showing substantial performance improvements on the Wisconsin Benchmark. He also developed database-engine software for a shared-nothing architecture on a 100-node PC cluster, which was evaluated against other systems on the TPC-D Benchmark in the late 1990s. He was among the first to apply hash-based approaches to parallel data mining.