Serializable isolation for snapshot databases

Michael Cahill, Uwe Röhm, Alan Fekete

Many popular database management systems implement a multiversion concurrency control algorithm called snapshot isolation rather than providing full serializability based on locking. There are well-known anomalies permitted by snapshot isolation that can lead to violations of data consistency by interleaving transactions that would maintain consistency if run serially. Until now, the only way to prevent these anomalies was to modify the applications by introducing explicit locking or artificial update conflicts, following careful analysis of conflicts between all pairs of transactions.This article describes a modification to the concurrency control algorithm of a database management system that automatically detects and prevents snapshot isolation anomalies at runtime for arbitrary applications, thus providing serializable isolation. The new algorithm preserves the properties that make snapshot isolation attractive, including that readers do not block writers and vice versa. An implementation of the algorithm in a relational DBMS is described, along with a benchmark and performance study, showing that the throughput approaches that of snapshot isolation in most cases.

Dr Michael Cahill is the VP Engineering (Storage) at MongoDB. He was a founder and architect at WiredTiger before its acquisition by MongoDB. Prior to that, he was an architect of Berkeley DB at Sleepycat Software and Oracle Corp., responsible for design and implementation of multiversion concurrency control, as well as SQL interfaces and programming language APIs. Previously, Dr. Cahill was CTO at Bullant Technology, which grew tenfold and raised over US$30 million from investors including Intel Capital and JP Morgan during his three year tenure. Dr. Cahill’s PhD from the University of Sydney is in the area of transaction processing and concurrency control. His work on a new algorithm for implementing serializable isolation received an ACM SIGMOD Best Paper award in 2008 and was added to PostgreSQL 9.1.
Uwe Röhm is an Associate Professor for data engineering at the University of Sydney, School of Information Technologies. He developed freshness-aware scheduling (FAS), a novel approach to replication management in massive clusters that allows clients to trade data freshness for query performance. Associate Professor Röhm completed a Ph.D. at ETH Zurich in 2002. He then worked as an IT consultant in Germany before joining the University of Sydney in 2004. He has been a visiting researcher with Microsoft’s SQL Server group and is Chief Investigator on an ARC-funded Discovery Project and also on a Linkage Project funded by ARC and Microsoft.
Alan Fekete is Professor of Enterprise Software Systems at the University of Sydney in Australia. He completed a BSc from University of Sydney, then a PhD from Harvard University, and he has held visiting positions at Cornell, MIT, U Washington, Microsoft Research, and UC Berkeley. Alan has an extensive history of work on the topic of transaction management. His research focus is currently on how to help application programmers understand and exploit the properties and performance of data management services that offer lower support for consistency or isolation. He also has published on theory of distributed computing (especially consensus techniques), formal methods for software engineering, and computing education. Alan has been recognized as ACM Distinguished Scientist.