SIGMOD Research Highlight Awards

SIGMOD Research Highlights showcase a set of research projects that exemplify core database research. In particular, these projects address an important problem, represent a definitive milestone in solving the problem, and have the potential for significant impact. SIGMOD Research Highlights also aim to make the selected works widely known in the database community, to our industry partners, and to the broader ACM community.

2026	DPconv: Super-Polynomially Faster Join Ordering	Mihail Stoian, Andreas Kipf
	Automating Vectorized Distributed Graph Computation	Wenyue Zhao, Yang Cao, Peter Buneman, Jia Li, Nikos Ntarmos
	AnyBlox: A Framework for Self-Decoding Datasets	Mateusz Gienieczko, Maximilian Kuschewski, Thomas Neumann, Viktor Leis, Jana Giceva
	Rel: A Programming Language for Relational Data	Molham Aref, Paolo Guagliardo, George Kastrinis, Leonid Libkin, Victor Marsault, Wim Martens, Mary McGrath, Filip Murlak, Nathaniel Nystrom, Liat Peterfreund, Allison Rogers, Cristina Sirangelo, Domagoj Vrgoč, David Zhao, Abdul Zreika
	MEMPHIS: Holistic Lineage-based Reuse and Memory Management for Multi-backend ML Systems	Arnab Phani, Matthias Boehm
	Diva: Dynamic Range Filter for Var-Length Keys and Queries	Navid Eslami, Ioana O. Bercea, Niv Dayan
	The Key to Effective UDF Optimization: Before Inlining, First Perform Outlining	Samuel Arch, Yuchen Liu, Todd C. Mowry, Jignesh M. Patel, Andrew Pavlo
	Output-sensitive Conjunctive Query Evaluation	Shaleen Deep, Hangdong Zhao, Austen Z. Fan, Paraschos Koutris
	Output-Optimal Algorithms for Join-Aggregate Queries	Xiao Hu
	Differentially Private Substring and Document Counting	Giulia Bernardini, Philip Bille, Inge Li Gørtz, Teresa Anna Steiner
2025	Join Size Bounds using lp-Norms on Degree Sequences	Mahmoud Abo Khamis, Vasileios Nakos, Dan Olteanu, Dan Suciu
	History-Independent Dynamic Partitioning: Operation-Order Privacy in Ordered Data Structures	Michael A. Bender, Martín Farach-Colton, Michael T. Goodrich, Hanna Komlós
	OmniSketch: Streaming Data Analytics with Arbitrary Predicates	Wieger R. Punter, Odysseas Papapetrou, Minos Garofalakis
	BOSS – An Architecture for Database Kernel Composition	Hubert Mohr-Daurat, Xuan Sun, Holger Pirk
	CausalMesh: A Causal Cache for Stateful Serverless Computing	Haoran Zhang, Shuai Mu, Sebastian Angel, and Vincent Liu
	Implementing Views for Property Graphs	Soonbo Han and Zachary G. Ives
	Reservoir Sampling over Joins	Binyang Dai, Xiao Hu, and Ke Yi
	On The Reasonable Effectiveness of Relational Diagrams: Explaining Relational Query Patterns and the Pattern Expressiveness of Relational Languages	Wolfgang Gatterbauer, Cody Dunne
	Repairing Raw Data Files with TASHEEH	Mazhar Hameed, Gerardo Vitagliano, Fabian Panse, and Felix Naumann
	GPTuner: An LLM-Based Database Tuning System	Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Mingjie Tang, and Jianguo Wang
2024	Consent Management in Data Workflows: A Graph Problem	Dorota Filipczuk, Enrico H. Gerding, George Konstantinidis
	Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries Sketch	Christian Janos Lebeda, Jakub Tetek
	Allocating Isolation Levels to Transactions in a Multiversion Setting	Brecht Vandevoort, Bas Ketsman, Frank Neven
	Extremal Fitting Problems for Conjunctive Queries	Balder Ten Cate, Victor Dalmau, Maurice Funk, Carsten Lutz
	Free Join: Unifying Worst-Case Optimal and Traditional Joins	Yisu Remy Wang, Max Willsey, Dan Suciu
	LAQy: Efficient & Reusable Query Approximations via Lazy Sampling	Viktor Sanca, Periklis Chrysogelos, Anastasia Ailamaki
	Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration	Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Guoliang Li, Xiaoyong Du, Xiaofeng Jia, Song Gao
	Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy	Lucas Rosenblatt, Bernease Herman, Anastasia Holovenko, Wonkwon Lee, Joshua Loftus, Elizabeth McKinnie, Taras Rumezhak, Andrii Stadnik, Bill Howe, Julia Stoyanovich
	Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples	Peng Li, Yeye He, Cong Yan, Yue Wang, Surajit Chaudhuri
	DBSP: Automatic Incremental View Maintenance for Rich Query Languages	Mihai Budiu, Tej Chajed, Frank McSherry, Leonid Ryzhyk, Val Tannen
2023	Ad Hoc Transactions: What They Are and Why We Should Care	Chuzhe Tang, Zhaoguo Wang, Xiaodong Zhang, Qianmian Yu, Binyu Zang, Haibing Guan, and Haibo Chen
	Sortledton: a Universal Graph Data Structure	Per Fuchs, Domagoj Margan, and Jana Giceva
	Efficiently Making Cross-Engine Transactions Consistent	Jianqiu Zhang, Kaisong Huang, Tianzheng Wang, and King Lv
	When is it safe to run a transactional workload under Read Committed?	Brecht Vandevoort, Bas Ketsman, Christoph Koch, and Frank Neven
	Building Write-Optimized Tree Indexes on Disaggregated Memory	Qing Wang, Youyou Lu, and Jiwu Shu
	Conjunctive Queries with Comparisons	Qichen Wang and Ke Yi
	Threshold Queries	Angela Bonifati, Stefania Dumbrava, George Fletcher, Jan Hidders, Matthias Hofer, Wim Martens, Filip Murlak, Joshua Shinavier, Sławek Staworko, and Dominik Tomaszuk
	Convergence of Datalog over (Pre-) Semirings	Mahmoud Abo Khamis, Hung Q. Ngo, Reinhard Pichler, Dan Suciu, and Yisu Remy Wang
	An Optimal Algorithm for Partial Order Multiway Search	Shangqi Lu, Wim Martens, Matthias Niewerth, and Yufei Tao
	Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs	Jeremy Chen, Yuqing Huang, Mushi Wang, Semih Salihoglu, and Kenneth Salem
	Revisiting Runtime Dynamic Optimization for Join Queries in Big Data Management Systems	Christina Pavlopoulou, Michael J. Carey, and Vassilis J. Tsotras
	R2T: Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys	Wei Dong, Juanru Fang, Ke Yi, Yuchao Tao, Ashwin Machanavajjhala
2022	Bao: Making Learned Query Optimization Practical	Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, and Tim Kraska
	DFI: The Data Flow Interface for High-Speed Networks	Lasse Thostrup, Jan Skrzypczak, Matthias Jasny, Tobias Ziegler, and Carsten Binnig
	FoundationDB: A Distributed Key Value Store	Jingyu Zhou, Meng Xu, Alexander Shraer, Bala Namasivayam, Alex Miller, Evan Tschannen, Steve Atherton, Andrew J. Beamon, Rusty Sears, John Leach, Dave Rosenthal, Xin Dong, Will Wilson, Ben Collins, David Scherer, Alec Grieser, Young Liu, Alvin Moore, Bhaskar Muppana, Xiaoge Su, and Vishesh Yadav
	TURL: Table Understanding through Representation Learning	Xiang Deng, Huan Sun, Alyssa Lees, You Wu, and Cong Yu
	No PANE, No Gain: Scaling Attributed Network Embedding in a Single Server	Renchi Yang, Jieming Shi, Xiaokui Xiao, Yin Yang, Sourav S. Bhowmick, and Juncheng Liu
	Bipartite Matching: What to do in the Real World When Computing Assignment Costs Dominates Finding the Optimal Assignment	Tenindra Abeywickrama, Victor Liang, and Kian-Lee Tan
	Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds?	Gábor E. Gévay, Tilmann Rabl, Sebastian Breß, Loránd Madai-Tahy, Jorge-Arnulfo Quiané-Ruiz, and Volker Markl
	Relative Error Streaming Quantiles	Graham Cormode, Zohar Karnin, Edo Liberty, Justin Thaler, and Pavel Veselý
	Structure and Complexity of Bag Consistency	Albert Atserias and Phokion G. Kolaitis
	Model Counting Meets Distinct Elements in a Data Stream	A. Pavan, N. V. Vinodchandran, Arnab Bhattacharyya, and Kuldeep S. Meel
2021	A Framework for Adversarially Robust Streaming Algorithms	Omri Ben-Eliezer, Rajesh Jayaram, David P. Woodruff, Eylon Yogev
	Chiller: Contention-centric Transaction Execution and Data Partitioning for Modern Networks	Erfan Zamanian, Julian Shun, Carsten Binnig, Tim Kraska
	DIAMetrics: Benchmarking Query Engines at Scale	Shaleen Deep, Anja Gruenheid, Kruthi Nagaraj, Hiro Naito, Jeff Naughton, Stratis Viglas
	Efficient Directed Densest Subgraph Discovery	Chenhao Ma, Yixiang Fang, Reynold Cheng, Laks V.S. Lakshmanan, Wenjie Zhang, Xuemin Lin
	Fair near neighbor search via sampling	Martin Aumüller, Sariel Har-Peled, Sepideh Mahabadi, Rasmus Pagh
	From Sketching to Natural Language: Expressive Visual Querying for Accelerating Insight	Tarique Siddiqui, Paul Luh, Zesheng Wang, Karrie Karahalios, Aditya G. Parameswaran
	Optimistically Compressed Hash Tables & Strings in the USSR	Tim Gubner, Viktor Leis, Peter Boncz
	Probabilistic Data with Continuous Distributions	Martin Grohe, Benjamin Lucien Kaminski, Joost-Pieter Katoen, Peter Lindner
	Query Games in Databases	Ester Livshits, Leopoldo Bertossi, Benny Kimelfeld, Moshe Sebag
	Scaling Dynamic Hash Tables on Real Persistent Memory	Baotong Lu, Xiangpeng Hao, Tianzheng Wang, Eric Lo
2020	Checking Invariant Confluence, In Whole or In Parts	Michael Whittaker, Joseph M. Hellerstein
	Concurrent Prefix Recovery: Performing CPR on a Database	Guna Prasaad, Badrish Chandramouli, Donald Kossmann
	Constant-Delay Enumeration for Nondeterministic Document Spanners	Antoine Amarilli, Pierre Bourhis, Stefan Mengel, Matthias Niewerth
	Database Repair Meets Algorithmic Fairness	Babak Salimi, Bill Howe, Dan Suciu
	Declarative Recursive Computation on an RDBMS or, Why You Should Use a Database For Distributed Machine Learning	Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, Zekai J. Gao
	Efficient Logspace Classes for Enumeration, Counting, and Uniform Generation	Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, Cristian Riveros
	Query Optimization for Faster Deep CNN Explanations	Supun Nakandala, Arun Kumar, and Yannis Papakonstantinou
	Revealing Every Story of Data in Blockchain Systems	Pingcheng Ruan, Tien Tuan Anh Dinh, Qian Lin, Meihui Zhang, Gang Chen, Beng Chin Ooi
2019	Succinct Range Filters	Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G. Andersen, Michael Kaminsky, Kimberly Keeton and Andrew Pavlo
	Online Model Management via Temporally Biased Sampling	Brian Hentschel, Peter J. Haas and Yuanyuan Tian
	MATLANG: Matrix operations and their expressive power	Robert Brijder, Floris Geerts, Jan Van den Bussche and Timmy Weerwag
	How Do Humans and Data Systems Establish a Common Query Language?	Ben McCamish, Vahid Ghadakchi, Arash Termehchy, Liang Huang and Behrouz Touri
	Efficient Signal Reconstruction for a Broad Range of Applications	Abolfazl Asudeh, Jees Augustine, Azade Nazi, Saravanan Thirumuruganathan, Nan Zhang, Gautam Das and Divesh Srivastava
	Efficient Query Processing for Dynamically Changing Datasets	Muhammad Idris, Martín Ugarte, Stijn Vansummeren, Hannes Voigt and Wolfgang Lehner
	Entity Matching with Quality and Error Guarantees	Yufei Tao
	εKTELO: A Framework for Defining Differentially-Private Computations	Dan Zhang, Ryan McKenna, Ios Kotsogiannis, George Bissias, Michael Hay, Ashwin Machanavajjhala and Gerome Miklau
	Bridging Theory and Practice with Query Log Analysis	Wim Martens, Tina Trautner
2018	Natural Language Explanations for Query Results	Daniel Deutch, Nave Frost, Amir Gilad
	Magellan: Toward Building Entity Matching Management Systems	Pradap Konda, Sanjib Das, Paul Suganthan G.C., Philip Martinkus, AnHai Doan, Adel Ardalan, Jeffrey R. Ballard, Yash Govind, Han Li, Fatemah Panahi, Haojun Zhang, Jeff Naughton, Shishir Prasad, Ganesh Krishnan, Rohit Deep, Vijay Raghavendra
	Scalable Linear Algebra on a Relational Database System	Shangyu Luo, Zekai J. Gao, Michael Gubanov, Luis L. Perez, Christopher Jermaine
	From Think Parallel to Think Sequential	Wenfei Fan, Yang Cao, Jingbo Xu, Wenyuan Yu, Yinghui Wu, Chao Tian, Jiaxin Jiang, Bohan Zhang
	A Relational Framework for Classifier Engineering	Benny Kimelfeld, Christopher Ré
2017	Scaling Machine Learning via Compressed Linear Algebra	Ahmed Elgohary, Matthias Boehm, Peter J. Haas, Frederick R. Reiss, Berthold Reinwald
	Wander Join and XDB: Online Aggregation via Random Walks	Feifei Li, Bin Wu, Ke Yi, Zhuoyue Zhao
	A Scalable Execution Engine for Package Queries	Matteo Brucato, Azza Abouzied, Alexandra Meliou
	Optimizing Tree Patterns for Querying Graph- and Tree-Structured Data	Wojciech Czerwiński, Wim Martens, Matthias Niewerth, Paweł Parys
	Juggling Functions Inside a Database	Mahmoud Abo Khamis, Hung Q. Ngo, Atri Rudra
2016	Multi-Objective Parametric Query Optimization	Immanuel Trummer, Christoph Koch
	Data partitioning for single-round multi-join evaluation in massively parallel systems	Tom J. Ameloot, Gaetano Geck, Bas Ketsman, Frank Neven, Thomas Schwentick
	Resource Bricolage for Parallel DBMSs on Heterogeneous Clusters	Jiexing Li, Jeffrey Naughton, Rimma V. Nehme
	Implicit Parallelism through Deep Language Embedding	Alexander Alexandrov, Asterios Katsifodimos, Georgi Krastev, Volker Markl
	DeepDive: Declarative Knowledge Base Construction	Christopher De Sa, Alex Ratner, Christopher Ré, Jaeho Shin, Feiran Wang, Sen Wu, Ce Zhang
	k-Shape: Efficient and Accurate Clustering of Time Series	John Paparrizos, Luis Gravano
	Understanding Natural Language Queries over Relational Databases	Fei Li, H. V. Jagadish