From Structured Documents to Novel Query Facilities

Vassilis Christophides, Serge Abiteboul, Sophie Cluet, and Michel Scholl

This paper made several lasting contributions by combining SGML-based document management with database technology:

  1. It presented a technique to map DTDs to DB schemata and to store SGML documents in a database in such a way that the document structure is preserved and can be used for querying.
  2. It introduced paths and attributes of SGML as first-class citizens. This is the most interesting novelty: it allows schema information and data to be queried in a homogeneous fashion, so that SGML documents can be navigated by their structure as well as their values.
  3. As a practical consequence, information retrieval based on text patterns can be generalized to include document structure. In effect, this adds semantics to the query facilities of an information system.
  4. The paper laid the formal foundations for query languages for semistructured data, which later, in the form of XPath, became the core of several such query languages.
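
The idea behind points 2 and 3 can be illustrated with a small sketch. It is not the query language of the paper: it assumes an XML document standing in for SGML, uses Python's standard xml.etree.ElementTree module, and the sample document and the helper paths_with_text are hypothetical names chosen for illustration. The helper returns the path to each matching element as an ordinary value, alongside the matched text, so structure and content are queried together.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; XML is used here as a stand-in for SGML.
DOC = """
<article>
  <title>Structured documents</title>
  <section>
    <title>Introduction</title>
    <para>SGML documents carry structure.</para>
  </section>
  <section>
    <title>Queries</title>
    <para>Paths can be queried like data.</para>
  </section>
</article>
"""


def paths_with_text(root, pattern):
    """Return (path, text) pairs for every element whose own text contains
    pattern (case-insensitive). The path is returned as data, not just used
    for navigation."""
    results = []

    def walk(elem, path):
        here = path + [elem.tag]
        text = (elem.text or "").strip()
        if pattern.lower() in text.lower():
            results.append(("/".join(here), text))
        for child in elem:
            walk(child, here)

    walk(root, [])
    return results


root = ET.fromstring(DOC)

# Text-pattern retrieval that also reports where in the structure each hit occurs.
hits = paths_with_text(root, "structure")

# Structure-based navigation with a fixed path, in the style XPath later standardized.
section_titles = [t.text for t in root.findall("./section/title")]

for path, text in hits:
    print(path, "->", text)
print(section_titles)
```

The first query starts from a text pattern and discovers the paths; the second starts from a known path and retrieves the values. Treating both directions uniformly is what the commentary means by paths being first-class.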

Although the paper deals primarily with SGML, most of its ideas were carried over to XML, where they became highly significant.