SIGMOD Accepted Tutorials
Below is a list of the tutorials accepted for
the ACM SIGMOD 2003 conference, to be held as part of the Federated Computing
Research Conference (FCRC), in San Diego, California, USA, June 10-12, 2003.
The final titles and author lists are subject to change and will be posted after
the camera-ready copy is due in mid-March 2003.
Tutorials:
Tutorial 1
Chair: Surajit Chaudhuri
Data Quality and Data Cleaning: An Overview
Theodore Johnson and Tamraparni Dasu, AT&T Labs Research
Data quality is a serious concern in any data-driven enterprise,
often creating misleading findings during data mining, and causing
process disruptions in operational databases. The manifestations of
data quality problems can be very expensive: "losing" customers,
"misplacing" billions of dollars' worth of equipment, misallocating
resources due to glitched forecasts, and so on. Solving data quality
problems typically requires a very large investment of time and
energy; often 80% to 90% of a data analysis project is spent
making the data reliable enough that the results can be trusted.
In this tutorial, we present a multidisciplinary approach to data
quality problems. We start by discussing the meaning of data
quality and the sources of data quality problems. We show how these
problems can be addressed by a multidisciplinary approach,
combining techniques from management science, statistics, database
research, and metadata management. Next, we present an updated
definition of data quality metrics, and illustrate their application
with a case study. We conclude with a survey of recent database
research that is relevant to data quality problems, and suggest
directions for future research.
Tutorial 2
Chair (Part 1): Sihem Amer-Yahia
Chair (Part 2): Dan Suciu
XQuery: A Query Language for XML
Don Chamberlin, IBM Almaden Research Center
XQuery is the XML query language currently under development in the
World Wide Web Consortium (W3C). XQuery specifications have been
published in a series of W3C working drafts, and several reference
implementations of the language are already available on the Web. If
successful, XQuery has the potential to be one of the most important
new computer languages to be introduced in several years. This
tutorial will provide an overview of the syntax and semantics of
XQuery, as well as insight into the principles that guided the design
of the language.
The speaker is Don Chamberlin, one of IBM's representatives on the XML
Query Working Group, and co-author of the Quilt language proposal that
influenced the basic design of XQuery.
Tutorial 3
Chair: Alin Deutsch
Data Grid Management Systems
Arun Jagatheesan and Arcot Rajasekar, San Diego Supercomputer Center
Data Grids are being built across the world as next-generation
data-handling systems to manage petabytes of inter-organizational
data and storage space. A data grid (datagrid) is a logical name space
consisting of storage resources and digital entities that is created
by the cooperation of autonomous organizations and their users based on
the coordination of local and global policies. Data Grid Management
Systems (DGMSs) provide services for the confluence of organizations
and management of inter-organizational data and resources in the
datagrid.
The objective of the tutorial is to provide an introduction to the
opportunities and challenges of this emerging technology. Novices and
experts alike should benefit from this tutorial. It will cover an
introduction, use cases, design philosophies, architecture, research
issues, existing technologies, and demonstrations. Hands-on sessions
in which participants try out the existing technologies may
be provided, depending on the availability of Internet connections.