RIDE 2001

Copy-Based versus Edit-Based Version Management Schemes for Structured Documents


Shu-Yao Chien, Vassilis J. Tsotras, and Carlo Zaniolo




Abstract

Managing multiple versions of XML documents and semistructured data is a problem of growing interest. Traditional version control methods, such as RCS, use edit scripts that record the changes to a document and support the incremental reconstruction of its different versions. These edit-based approaches have been enhanced with a replication scheme called UBCC (Chien et al., 2000), which is based on the notion of page usefulness and ensures effective management of multi-version documents in terms of both retrieval and storage cost. These improvements notwithstanding, the edit-based representation suffers from limited generality and flexibility; for example, it cannot represent changes such as rearranging the document or duplicating parts of its content. To solve these problems, the paper proposes a copy-based UBCC versioning scheme, which also provides a simpler format for the electronic exchange of multi-version documents. With the objective of matching the performance of the edit-based UBCC technique, we develop algorithms that enhance the copy-based scheme with page usefulness management. We also present the results of experiments that test the storage and retrieval performance of the new copy-based approach and compare it with that of the edit-based UBCC approach.
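
The contrast between the two representations can be illustrated with a minimal sketch. It is not the paper's algorithm and does not model UBCC or page usefulness. The script formats and the function names (apply_edit_script, apply_copy_script) are assumptions made up for illustration: a document is treated as a flat list of elements, an edit script as a list of insert and delete operations against the previous version, and a copy script as a list of ranges copied from the previous version plus new content.

```python
from typing import Callable, List


def apply_edit_script(prev: List[str], script: list) -> List[str]:
    """Rebuild a version from the previous one using insert/delete edits.

    Hypothetical format: ("insert", pos, [elements]) or ("delete", pos, count).
    Positions refer to the document as it stands when the operation runs.
    """
    doc = list(prev)
    for op, pos, arg in script:
        if op == "insert":
            doc[pos:pos] = arg
        elif op == "delete":
            del doc[pos:pos + arg]
        else:
            raise ValueError(f"unknown edit operation: {op}")
    return doc


def apply_copy_script(prev: List[str], script: list) -> List[str]:
    """Rebuild a version as a sequence of copied ranges and new content.

    Hypothetical format: ("copy", start, end) copies prev[start:end];
    ("new", [elements]) adds content that the previous version lacks.
    """
    doc: List[str] = []
    for item in script:
        if item[0] == "copy":
            _, start, end = item
            doc.extend(prev[start:end])
        elif item[0] == "new":
            doc.extend(item[1])
        else:
            raise ValueError(f"unknown copy operation: {item[0]}")
    return doc


def reconstruct(base: List[str], scripts: list, apply: Callable) -> List[str]:
    """Incrementally reconstruct the latest version from a base version."""
    doc = base
    for script in scripts:
        doc = apply(doc, script)
    return doc


v1 = ["<title>", "<sec A>", "<sec B>"]

# Edit-based: drop section B, then insert a new section C before section A.
v2_edit = apply_edit_script(v1, [("delete", 2, 1), ("insert", 1, ["<sec C>"])])

# Copy-based: move section B before section A and also duplicate it at the
# end, without storing the text of section B again.
v2_copy = apply_copy_script(
    v1, [("copy", 0, 1), ("copy", 2, 3), ("copy", 1, 2), ("copy", 2, 3)]
)
```

In this sketch, an insert/delete script can only imitate a move or a duplication by deleting content and re-inserting it as new text, whereas a copy script refers back to the existing content by range.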


DiSC'02 © 2003 Association for Computing Machinery