PODS Keynotes and Speakers
Keynote 1: Management of Probabilistic Data: Foundations and Challenges
Dan Suciu, University of Washington, Seattle, WA, USA
Many applications today need to manage large volumes of uncertain data and to represent the degree of uncertainty as probabilities. Examples include fuzzy object matching, uncertain schema mappings, exploratory queries in databases, and RFID and sensor data. This talk discusses the probabilistic data model and the complexity of evaluating queries over disjoint/independent probabilistic databases: for some queries the exact probabilities can be computed efficiently, even using an existing relational database engine, while for other queries the exact probabilities are hard to compute but can be approximated with reasonable efficiency. The management of probabilistic data raises many interesting and difficult problems that combine logic with probability theory; we will discuss some of these problems and suggest future research directions.
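The abstract does not fix a particular query class, but the simplest tractable case it alludes to can be sketched concretely. The following is an illustrative example (not taken from the talk) of evaluating an existential query over a hypothetical tuple-independent probabilistic table, where each tuple carries an independent probability of being present:

```python
# Illustrative sketch: query evaluation over a tuple-independent
# probabilistic table. Each entry is (tuple, p), where p is the
# independent probability that the tuple is actually present.

def query_probability(table, predicate):
    """P(at least one tuple satisfying `predicate` is present).

    With independent tuples, the query fails only if every matching
    tuple is absent, so P = 1 - prod(1 - p) over matching tuples.
    """
    prob_none = 1.0
    for row, p in table:
        if predicate(row):
            prob_none *= 1.0 - p
    return 1.0 - prob_none

# Hypothetical sensor data: (reading, probability the reading is correct).
readings = [
    (("room1", "occupied"), 0.9),
    (("room2", "occupied"), 0.5),
    (("room2", "empty"), 0.5),
]

p = query_probability(readings, lambda r: r[1] == "occupied")
# p = 1 - (1 - 0.9) * (1 - 0.5) = 0.95
```

This product form is exactly what makes such queries cheap: the probability can be accumulated tuple by tuple, even inside a relational engine via aggregation. For queries without such a decomposition, exact evaluation becomes hard, which is the dichotomy the talk addresses.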
Dan Suciu is a professor at the University of Washington. He received his Ph.D. from the University of Pennsylvania in 1995 and was a principal member of the technical staff at AT&T Labs until he joined the University of Washington in 2000. Suciu conducts research in data management, with an emphasis on topics that arise from sharing data on the Internet, such as the management of semistructured and heterogeneous data, data security, and managing data with uncertainties. He is a co-author of the book Data on the Web: From Relations to Semistructured Data and XML, holds six US patents, received the 2000 ACM SIGMOD Best Paper Award, and is a recipient of an NSF CAREER Award and an Alfred P. Sloan Fellowship.
Invited Tutorial 1: A crash course on database queries
Jan Van den Bussche, University of Hasselt, Belgium
Complex database queries, like general programs, can "crash", i.e., raise runtime errors. We want to avoid crashes without losing expressive power, or we want to correctly predict the absence of crashes. We show how concepts and techniques from programming language theory, notably type systems and reflection, can be adapted to this end. Of course, the specific nature of database queries (as opposed to general programs) also requires some new methods and raises new questions.
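The tutorial abstract leaves the notion of "predicting the absence of crashes" abstract; as a minimal, hypothetical illustration (not from the tutorial), a static check against the schema can guarantee that one kind of runtime error, referencing an attribute the relation does not have, can never occur:

```python
# Illustrative sketch: a static check, in the spirit of a type system,
# that predicts the absence of one kind of query "crash": projecting on
# an attribute that the relation's schema does not provide.

def crash_free(query_attrs, schema):
    """True iff every attribute mentioned by the query exists in the
    schema, so attribute access can never raise a runtime error."""
    return set(query_attrs) <= set(schema)

# Hypothetical employee schema.
schema = {"id", "name", "salary"}

assert crash_free({"name", "salary"}, schema)     # statically safe
assert not crash_free({"name", "bonus"}, schema)  # would crash at runtime
```

A check like this rejects some queries before execution; the tension the tutorial addresses is doing so without rejecting crash-free queries, i.e., without losing expressive power.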
Jan Van den Bussche received his PhD from the University of Antwerp, Belgium, in 1993, with Jan Paredaens as advisor. He is currently a professor at the University of Hasselt, Belgium. His research activities are centered around the theory of database systems, broadly construed. His current research interests include database query languages; query processing; data mining; workflows; and DNA computing. He chaired the program committee of ICDT 2001 (with Victor Vianu) and of PODS 2006, and currently he serves as chair of the ICDT council (International Conference on Database Theory).
Invited Tutorial 2: Machine models and lower bounds for query processing
Nicole Schweikardt, Humboldt-University Berlin, Germany
Two different scenarios for the processing of massive amounts of data have been extensively studied in recent years:
1. External memory processing: This scenario considers data that is stored in external memory. When processing such data, the input/output communication between fast internal memory and slower external memory is a major performance bottleneck. There has been a wealth of research on the design and analysis of so-called external memory algorithms, which aim to minimize the cost of external memory accesses.
2. Data stream processing: This scenario considers data that is not stored but instead arrives as a stream and has to be processed on the fly using only a limited amount of memory. Typical application areas include IP network traffic analysis and the processing of meteorological data generated by sensor networks.
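The second scenario's defining constraint, one pass over the input with memory far smaller than the data, can be made concrete with a small sketch (an illustration, not material from the tutorial): computing summary statistics of a hypothetical sensor stream in constant memory, without ever storing the stream.

```python
# Illustrative sketch: one-pass processing of a data stream using only
# constant memory, as required in the data stream scenario. The stream
# is consumed item by item and is never stored in full.

def stream_summary(stream):
    """Compute count, mean, min, and max of a numeric stream in O(1) memory."""
    count, total = 0, 0.0
    lo = hi = None
    for x in stream:          # each item is seen exactly once
        count += 1
        total += x
        lo = x if lo is None else min(lo, x)
        hi = x if hi is None else max(hi, x)
    mean = total / count if count else None
    return {"count": count, "mean": mean, "min": lo, "max": hi}

# Hypothetical temperature readings arriving one at a time.
summary = stream_summary(iter([21.0, 19.5, 22.5, 20.0]))
# summary == {"count": 4, "mean": 20.75, "min": 19.5, "max": 22.5}
```

Quantities like these admit exact one-pass answers; many others (e.g., the number of distinct elements, or the median) provably do not, which is where the lower-bound results surveyed in the talk come in.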
This talk gives an overview of some recent work on the theoretical foundations of these two scenarios. The main focus is on generalizations of the classical data stream model in which, in addition to an internal memory of limited size, a number of (potentially huge) streams may be used as external memory devices.
Nicole Schweikardt received her PhD from the Johannes Gutenberg-University in Mainz, Germany, in 2002. She was a postdoctoral researcher at the University of Edinburgh, U.K., and is currently an assistant professor at Humboldt-University Berlin, Germany. Her current research interests are in logic, database theory, and complexity theory, in particular the complexity of processing massive datasets and data streams, efficient query evaluation, and the expressivity and complexity of query languages and logics. Her work on the complexity of processing massive data sets received the best paper award of the 32nd International Colloquium on Automata, Languages, and Programming (ICALP 2005, jointly with Martin Grohe and Christoph Koch) as well as the Heinz Maier-Leibnitz Prize, a young investigators' award conferred by the German Research Foundation (DFG) and the German Federal Ministry of Education and Research (BMBF).
Nicole Schweikardt currently serves as a member of the ICDT council (International Conference on Database Theory) and as LICS Publicity Co-Chair (IEEE Symposium on Logic in Computer Science). She is a selected member of the Young Academy, a joint project of Germany's two oldest scientific academies.