SIGMOD Research Highlights showcase a set of research projects that exemplify core database research. In particular, these projects address an important problem, represent a definitive milestone in solving the problem, and have the potential for significant impact. SIGMOD Research Highlights also aim to make the selected works widely known in the database community, to our industry partners, and to the broader ACM community.

2024
Consent Management in Data Workflows: A Graph Problem
Dorota Filipczuk, Enrico H. Gerding, George Konstantinidis

Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries Sketch
Christian Janos Lebeda, Jakub Tetek

Allocating Isolation Levels to Transactions in a Multiversion Setting
Brecht Vandevoort, Bas Ketsman, Frank Neven

Extremal Fitting Problems for Conjunctive Queries
Balder ten Cate, Victor Dalmau, Maurice Funk, Carsten Lutz

Free Join: Unifying Worst-Case Optimal and Traditional Joins
Yisu Remy Wang, Max Willsey, Dan Suciu

LAQy: Efficient & Reusable Query Approximations via Lazy Sampling
Viktor Sanca, Periklis Chrysogelos, Anastasia Ailamaki

Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration
Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Guoliang Li, Xiaoyong Du, Xiaofeng Jia, Song Gao

Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy
Lucas Rosenblatt, Bernease Herman, Anastasia Holovenko, Wonkwon Lee, Joshua Loftus, Elizabeth McKinnie, Taras Rumezhak, Andrii Stadnik, Bill Howe, Julia Stoyanovich

Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples
Peng Li, Yeye He, Cong Yan, Yue Wang, Surajit Chaudhuri

DBSP: Automatic Incremental View Maintenance for Rich Query Languages
Mihai Budiu, Tej Chajed, Frank McSherry, Leonid Ryzhyk, Val Tannen
2023
Ad Hoc Transactions: What They Are and Why We Should Care
Chuzhe Tang, Zhaoguo Wang, Xiaodong Zhang, Qianmian Yu, Binyu Zang, Haibing Guan, and Haibo Chen
Sortledton: a Universal Graph Data Structure
Per Fuchs, Domagoj Margan, and Jana Giceva
Efficiently Making Cross-Engine Transactions Consistent
Jianqiu Zhang, Kaisong Huang, Tianzheng Wang, and King Lv
When is it safe to run a transactional workload under Read Committed?
Brecht Vandevoort, Bas Ketsman, Christoph Koch, and Frank Neven
Building Write-Optimized Tree Indexes on Disaggregated Memory
Qing Wang, Youyou Lu, and Jiwu Shu
Conjunctive Queries with Comparisons
Qichen Wang and Ke Yi
Threshold Queries
Angela Bonifati, Stefania Dumbrava, George Fletcher, Jan Hidders, Matthias Hofer, Wim Martens, Filip Murlak, Joshua Shinavier, Sławek Staworko, and Dominik Tomaszuk
Convergence of Datalog over (Pre-) Semirings
Mahmoud Abo Khamis, Hung Q. Ngo, Reinhard Pichler, Dan Suciu, and Yisu Remy Wang
An Optimal Algorithm for Partial Order Multiway Search
Shangqi Lu, Wim Martens, Matthias Niewerth, and Yufei Tao
Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs
Jeremy Chen, Yuqing Huang, Mushi Wang, Semih Salihoglu, and Kenneth Salem
Revisiting Runtime Dynamic Optimization for Join Queries in Big Data Management Systems
Christina Pavlopoulou, Michael J. Carey, and Vassilis J. Tsotras
R2T: Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys
Wei Dong, Juanru Fang, Ke Yi, Yuchao Tao, Ashwin Machanavajjhala
2022
Bao: Making Learned Query Optimization Practical
Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, and Tim Kraska
DFI: The Data Flow Interface for High-Speed Networks
Lasse Thostrup, Jan Skrzypczak, Matthias Jasny, Tobias Ziegler, and Carsten Binnig
FoundationDB: A Distributed Key Value Store
Jingyu Zhou, Meng Xu, Alexander Shraer, Bala Namasivayam, Alex Miller, Evan Tschannen, Steve Atherton, Andrew J. Beamon, Rusty Sears, John Leach, Dave Rosenthal, Xin Dong, Will Wilson, Ben Collins, David Scherer, Alec Grieser, Young Liu, Alvin Moore, Bhaskar Muppana, Xiaoge Su, and Vishesh Yadav
TURL: Table Understanding through Representation Learning
Xiang Deng, Huan Sun, Alyssa Lees, You Wu, and Cong Yu
No PANE, No Gain: Scaling Attributed Network Embedding in a Single Server
Renchi Yang, Jieming Shi, Xiaokui Xiao, Yin Yang, Sourav S. Bhowmick, and Juncheng Liu
Bipartite Matching: What to do in the Real World When Computing Assignment Costs Dominates Finding the Optimal Assignment
Tenindra Abeywickrama, Victor Liang, and Kian-Lee Tan
Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds?
Gábor E. Gévay, Tilmann Rabl, Sebastian Breß, Loránd Madai-Tahy, Jorge-Arnulfo Quiané-Ruiz, and Volker Markl
Relative Error Streaming Quantiles
Graham Cormode, Zohar Karnin, Edo Liberty, Justin Thaler, and Pavel Veselý
Structure and Complexity of Bag Consistency
Albert Atserias and Phokion G. Kolaitis
Model Counting Meets Distinct Elements in a Data Stream
A. Pavan, N. V. Vinodchandran, Arnab Bhattacharyya, and Kuldeep S. Meel
2021
A Framework for Adversarially Robust Streaming Algorithms
Omri Ben-Eliezer, Rajesh Jayaram, David P. Woodruff, Eylon Yogev
Chiller: Contention-centric Transaction Execution and Data Partitioning for Modern Networks
Erfan Zamanian, Julian Shun, Carsten Binnig, Tim Kraska
DIAMetrics: Benchmarking Query Engines at Scale
Shaleen Deep, Anja Gruenheid, Kruthi Nagaraj, Hiro Naito, Jeff Naughton, Stratis Viglas
Efficient Directed Densest Subgraph Discovery
Chenhao Ma, Yixiang Fang, Reynold Cheng, Laks V.S. Lakshmanan, Wenjie Zhang, Xuemin Lin
Fair near neighbor search via sampling
Martin Aumüller, Sariel Har-Peled, Sepideh Mahabadi, Rasmus Pagh
From Sketching to Natural Language: Expressive Visual Querying for Accelerating Insight
Tarique Siddiqui, Paul Luh, Zesheng Wang, Karrie Karahalios, Aditya G. Parameswaran
Optimistically Compressed Hash Tables & Strings in the USSR
Tim Gubner, Viktor Leis, Peter Boncz
Probabilistic Data with Continuous Distributions
Martin Grohe, Benjamin Lucien Kaminski, Joost-Pieter Katoen, Peter Lindner
Query Games in Databases
Ester Livshits, Leopoldo Bertossi, Benny Kimelfeld, Moshe Sebag
Scaling Dynamic Hash Tables on Real Persistent Memory
Baotong Lu, Xiangpeng Hao, Tianzheng Wang, Eric Lo
2020
Checking Invariant Confluence, In Whole or In Parts
Michael Whittaker, Joseph M. Hellerstein
Concurrent Prefix Recovery: Performing CPR on a Database
Guna Prasaad, Badrish Chandramouli, Donald Kossmann
Constant-Delay Enumeration for Nondeterministic Document Spanners
Antoine Amarilli, Pierre Bourhis, Stefan Mengel, Matthias Niewerth
Database Repair Meets Algorithmic Fairness
Babak Salimi, Bill Howe, Dan Suciu
Declarative Recursive Computation on an RDBMS or, Why You Should Use a Database For Distributed Machine Learning
Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, Zekai J. Gao
Efficient Logspace Classes for Enumeration, Counting, and Uniform Generation
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, Cristian Riveros
Query Optimization for Faster Deep CNN Explanations
Supun Nakandala, Arun Kumar, and Yannis Papakonstantinou
Revealing Every Story of Data in Blockchain Systems
Pingcheng Ruan, Tien Tuan Anh Dinh, Qian Lin, Meihui Zhang, Gang Chen, Beng Chin Ooi
2019
Succinct Range Filters
Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G. Andersen, Michael Kaminsky, Kimberly Keeton and Andrew Pavlo
Online Model Management via Temporally Biased Sampling
Brian Hentschel, Peter J. Haas and Yuanyuan Tian
MATLANG: Matrix operations and their expressive power
Robert Brijder, Floris Geerts, Jan Van den Bussche and Timmy Weerwag
How Do Humans and Data Systems Establish a Common Query Language?
Ben McCamish, Vahid Ghadakchi, Arash Termehchy, Liang Huang and Behrouz Touri
Efficient Signal Reconstruction for a Broad Range of Applications
Abolfazl Asudeh, Jees Augustine, Azade Nazi, Saravanan Thirumuruganathan, Nan Zhang, Gautam Das and Divesh Srivastava
Efficient Query Processing for Dynamically Changing Datasets
Muhammad Idris, Martín Ugarte, Stijn Vansummeren, Hannes Voigt and Wolfgang Lehner
Entity Matching with Quality and Error Guarantees
Yufei Tao
εKTELO: A Framework for Defining Differentially-Private Computations
Dan Zhang, Ryan McKenna, Ios Kotsogiannis, George Bissias, Michael Hay, Ashwin Machanavajjhala and Gerome Miklau
Bridging Theory and Practice with Query Log Analysis
Wim Martens, Tina Trautner
2018
Natural Language Explanations for Query Results
Daniel Deutch, Nave Frost, Amir Gilad
Magellan: Toward Building Entity Matching Management Systems
Pradap Konda, Sanjib Das, Paul Suganthan G.C., Philip Martinkus, AnHai Doan, Adel Ardalan, Jeffrey R. Ballard, Yash Govind, Han Li, Fatemah Panahi, Haojun Zhang, Jeff Naughton, Shishir Prasad, Ganesh Krishnan, Rohit Deep, Vijay Raghavendra
Scalable Linear Algebra on a Relational Database System
Shangyu Luo, Zekai J. Gao, Michael Gubanov, Luis L. Perez, Christopher Jermaine
From Think Parallel to Think Sequential
Wenfei Fan, Yang Cao, Jingbo Xu, Wenyuan Yu, Yinghui Wu, Chao Tian, Jiaxin Jiang, Bohan Zhang
A Relational Framework for Classifier Engineering
Benny Kimelfeld, Christopher Ré
2017
Scaling Machine Learning via Compressed Linear Algebra
Ahmed Elgohary, Matthias Boehm, Peter J. Haas, Frederick R. Reiss, Berthold Reinwald
Wander Join and XDB: Online Aggregation via Random Walks
Feifei Li, Bin Wu, Ke Yi, Zhuoyue Zhao
A Scalable Execution Engine for Package Queries
Matteo Brucato, Azza Abouzied, Alexandra Meliou
Optimizing Tree Patterns for Querying Graph- and Tree-Structured Data
Wojciech Czerwiński, Wim Martens, Matthias Niewerth, Paweł Parys
Juggling Functions Inside a Database
Mahmoud Abo Khamis, Hung Q. Ngo, Atri Rudra
2016
Multi-Objective Parametric Query Optimization
Immanuel Trummer, Christoph Koch
Data partitioning for single-round multi-join evaluation in massively parallel systems
Tom J. Ameloot, Gaetano Geck, Bas Ketsman, Frank Neven, Thomas Schwentick
Resource Bricolage for Parallel DBMSs on Heterogeneous Clusters
Jiexing Li, Jeffrey Naughton, Rimma V. Nehme
Implicit Parallelism through Deep Language Embedding
Alexander Alexandrov, Asterios Katsifodimos, Georgi Krastev, Volker Markl
DeepDive: Declarative Knowledge Base Construction
Christopher De Sa, Alex Ratner, Christopher Ré, Jaeho Shin, Feiran Wang, Sen Wu, Ce Zhang
k-Shape: Efficient and Accurate Clustering of Time Series
John Paparrizos, Luis Gravano
Understanding Natural Language Queries over Relational Databases
Fei Li, H. V. Jagadish