Executing SQL over Encrypted Data in the Database-Service-Provider Model

Hakan Hacigumus, Bala Iyer, Chen Li, Sharad Mehrotra

This paper from the SIGMOD 2002 Conference remarkably anticipated the world of “Database as Service”, which did come about and continues to grow in importance. To get a sense of how visionary the work was, consider that this paper was published in June 2002 (and thus accepted in Jan 2002), several years before the Amazon EC2 and S3 services were launched in 2006 (and, of course, Amazon RDS and SQL Azure came later still). The core of the paper focuses on the challenge of leveraging cloud services while keeping some of the information (at the discretion of the enterprise/user) hidden from the service provider. Beyond the specific algorithmic details, the key contribution is the framework: (i) the introduction of a mapping function, and (ii) query splitting logic that determines how the work can be distributed across cloud and client when some information is encrypted. Is this framework used by enterprises today? As best as we can tell, the answer is perhaps no. But is the framework interesting, and does it have real possibilities of adoption, further impact, and more follow-on work by the research community? Absolutely. In summary, this paper is one of the early papers to foresee the world of Database as Service (before any of us was working on that problem). The specific technical focus was treated in reasonable depth. The impact of the technical focus has not yet been seen by industry, but this paper has the possibility of inspiring much more follow-on work and thinking (beyond the 140+ citations it already has in the ACM Digital Library).
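To make the two parts of the framework concrete, here is a minimal sketch of how a mapping function and query splitting might fit together. It is an illustration under stated assumptions, not the paper's implementation: the bucket boundaries, the bucket ids, the `salary` attribute, and every function name are hypothetical, and the `encrypt` helper is a base64 placeholder standing in for a real cipher.

```python
import json
from base64 import b64decode, b64encode

# Hypothetical partition of a numeric "salary" domain: [lo, hi) -> opaque bucket id.
BUCKETS = [((0, 40_000), 7), ((40_000, 80_000), 2), ((80_000, 200_000), 5)]

def map_value(v):
    """Mapping function: return the opaque bucket id covering value v."""
    for (lo, hi), bucket_id in BUCKETS:
        if lo <= v < hi:
            return bucket_id
    raise ValueError(f"{v} is outside the partitioned domain")

def map_range(lo, hi):
    """Bucket ids that overlap the closed query range [lo, hi]."""
    return {bid for (b_lo, b_hi), bid in BUCKETS if b_lo <= hi and lo < b_hi}

def encrypt(row):
    # Placeholder only: base64 is NOT encryption. A real client would use a proper cipher.
    return b64encode(json.dumps(row).encode()).decode()

def decrypt(blob):
    return json.loads(b64decode(blob))

def to_server_row(row):
    """What the provider stores: an opaque tuple plus a bucket id it can filter on."""
    return {"etuple": encrypt(row), "salary_bucket": map_value(row["salary"])}

def server_query(table, bucket_ids):
    """Server-side part: filter on bucket ids only, returning a superset of the answer."""
    return [r["etuple"] for r in table if r["salary_bucket"] in bucket_ids]

def client_query(table, lo, hi):
    """Client-side part: decrypt the candidates and apply the exact predicate."""
    candidates = server_query(table, map_range(lo, hi))
    rows = [decrypt(c) for c in candidates]
    return [r for r in rows if lo <= r["salary"] <= hi]

rows = [
    {"name": "ann", "salary": 35_000},
    {"name": "bob", "salary": 45_000},
    {"name": "cy", "salary": 75_000},
    {"name": "di", "salary": 120_000},
]
server_table = [to_server_row(r) for r in rows]
result = client_query(server_table, 50_000, 90_000)
```

In this sketch the provider sees only opaque tuples and bucket ids, so it can prune rows without learning exact values, and the client removes the false positives after decryption. Coarser buckets reveal less to the provider but push more filtering work onto the client.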

Hakan Hacigumus is the head of Data Management Research at NEC Labs America. His current interests include data management in the cloud, big data, data analytics, mobility, and service-oriented business models. Prior to NEC Labs, he was a researcher at IBM Almaden Research Center, where he worked on a wide range of areas in data management and services research. He received his Ph.D. in Computer Science from the University of California, Irvine.

Balakrishna (Bala) Iyer works for IBM as a Distinguished Engineer for Database Technology. He earned his B.Tech from IIT Bombay and his MS and PhD degrees from Rice University. He has worked previously for Bell Labs, Murray Hill, NJ. Bala has made contributions to the field of databases in the areas of temporal data, database as a service, compression, sorting, query processing, data mining, and encoded vector representation and processing. Many of his innovations are used every day, having been incorporated in IBM’s data management products like VSAM, IMS, DB2 and IBM Intelligent Miner, and in products from other leading vendors. His work on the temporal data model led to the standardization of temporal function in SQL 2011.

Chen Li is an associate professor in the Department of Computer Science at the University of California, Irvine. He received his Ph.D. degree in Computer Science from Stanford University in 2001, and his M.S. and B.S. in Computer Science from Tsinghua University, China, in 1996 and 1994, respectively. He received a National Science Foundation CAREER Award in 2003 and many other NSF grants and industry gifts. He was once a part-time Visiting Research Scientist at Google. His research interests are in the fields of data management and information search, including text search, data-intensive computing, and data integration. He is the founder of Bimaple Technology Inc., a company providing powerful search for enterprises and developers.

Sharad Mehrotra is a Professor in the School of Information and Computer Science at the University of California, Irvine and founding Director of the Center for Emergency Response Technologies (CERT) at UCI. From 2002 to 2009 he served as the Director and PI of the RESCUE project (Responding to Crisis and Unexpected Events) which, funded by NSF through its large ITR program, spanned 7 schools and consisted of 60 members. He is the recipient of the Outstanding Graduate Student Mentor Award in 2005. Prior to joining UCI, he was a member of the faculty at the University of Illinois, Urbana Champaign in the Department of Computer Science, where he was the recipient of the C. W. Gear Outstanding Junior Faculty Award. Mehrotra has also served as a Scientist at Matsushita Information Technology Laboratory immediately after graduating with a Ph.D. from the University of Texas at Austin (1988-1993). Mehrotra’s research expertise is in the areas of data management and distributed systems, in which he has made many pioneering contributions. Two such contributions include the concept of “database as a service” and “use of information retrieval techniques, particularly relevance feedback, in multimedia search”. Mehrotra is a recipient of numerous best paper nominations and awards, including the SIGMOD Best Paper award in 2001 for a paper entitled “Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases”, and the best paper award at DASFAA 2004 for the paper entitled “Efficient Execution of Aggregation Queries over Encrypted Databases”. Another of his papers, entitled “Concurrency Control in Hierarchical Multidatabase System”, was selected as one of the best VLDB 1994 submissions invited for the VLDB Journal. Mehrotra’s recent research focuses on data quality and data privacy, particularly in the context of cloud computing and sensor-driven situational awareness systems.