ACM SIGMOD City, Country, Year

PODS Invited Talks

Keynote

Datalog Redux: Experience and Conjecture

Joseph M. Hellerstein, UC Berkeley

Abstract

Datalog was a foundational topic in the early years of PODS, despite skepticism from practitioners about its relevance. This has been changing in recent years, with unlikely champions exploring and promoting its use as a practical basis for programming in a wide variety of application domains. We reflect on our use of Datalog to build systems of significant complexity for both networking and cloud computing infrastructure. Based on that experience, we present conjectures regarding next-generation programming languages, and the role that database theory could play in their development.

Invited Tutorial 1

From Information to Knowledge: Harvesting Entities and Relationships from Web Sources

Gerhard Weikum and Martin Theobald, Max Planck Institute for Informatics, Saarbruecken, Germany

Abstract

There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as Wikipedia and the progress in automatically extracting entities and relationships from semistructured as well as natural-language Web sources. Recent endeavors of this kind include DBpedia, EntityCube, KnowItAll, ReadTheWeb, and our own YAGO-NAGA project, among others. The goal is to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, and their mutual relations as well as temporal contexts, with high precision and high recall. This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting. It also addresses issues of querying the knowledge base and ranking answers.

Invited Tutorial 2

Information Complexity

T.S. Jayram, IBM Almaden Research Center

Abstract

Recent years have witnessed the overwhelming success of algorithms that operate on massive data. Several computing paradigms have been proposed for massive data set algorithms, such as data streams, sketching, and sampling, and understanding their limitations is a fundamental theoretical challenge. In this survey, we describe the information complexity paradigm that has proved successful in obtaining tight lower bounds for several well-known problems. Information complexity quantifies the amount of information about the inputs that must necessarily be propagated by any algorithm in solving a problem. We describe the key ideas of this paradigm, and highlight the beautiful interplay of techniques arising from diverse areas such as information theory, statistics, and geometry.
