
XMILL: An Efficient Compressor for XML Data


Hartmut Liefke and Dan Suciu


Abstract

We describe a tool for compressing XML data, with applications in data exchange and archiving, which usually achieves about twice the compression ratio of gzip at roughly the same speed. The compressor, called XMill, incorporates and combines existing compressors in order to apply them to heterogeneous XML data: it uses zlib (the library function for gzip), a collection of datatype-specific compressors for simple data types, and, possibly, user-defined compressors for application-specific data types.
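
The combination described in the abstract can be illustrated with a minimal Python sketch. It is not XMill's implementation. Grouping values by element tag, the 4-byte integer encoding, and the names encode_ints, encode_text, compress_by_tag and decode_ints are all assumptions made for illustration; the abstract states only that zlib is combined with datatype-specific compressors. The sketch covers data values only and does not encode the document's tree structure.

```python
import struct
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def encode_ints(values):
    """Hypothetical datatype-specific encoder: decimal strings -> 4-byte big-endian ints."""
    return b"".join(struct.pack(">i", int(v)) for v in values)


def encode_text(values):
    """Fallback encoder: NUL-separated UTF-8 text."""
    return "\0".join(values).encode("utf-8")


def compress_by_tag(xml_text, int_tags=frozenset()):
    """Group element text by tag (an assumption), encode each group, then apply zlib."""
    groups = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        if elem.text and elem.text.strip():
            groups[elem.tag].append(elem.text.strip())
    blobs = {}
    for tag, values in groups.items():
        encoder = encode_ints if tag in int_tags else encode_text
        blobs[tag] = zlib.compress(encoder(values))
    return blobs


def decode_ints(blob):
    """Inverse of encode_ints followed by zlib.compress."""
    raw = zlib.decompress(blob)
    return [str(n) for (n,) in struct.iter_unpack(">i", raw)]


doc = (
    "<log>"
    "<entry><id>1001</id><host>a.example</host></entry>"
    "<entry><id>1002</id><host>b.example</host></entry>"
    "</log>"
)
blobs = compress_by_tag(doc, int_tags={"id"})
ids = decode_ints(blobs["id"])
hosts = zlib.decompress(blobs["host"]).decode("utf-8").split("\0")
```

Here the integer-valued id elements pass through the specialised encoder before zlib, while host values fall back to plain text. A user-defined compressor would, under the same assumptions, be one more encoder function selected per tag.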


References


[1]
...
[2]
...
[3]
...
[4]
...
[5]
Roy Goldman, Jennifer Widom: DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. VLDB 1997: 436-445
[6]
Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft: Compressing Relations and Indexes. ICDE 1998: 370-379
[7]
Stéphane Grumbach, Fariza Tahi: A New Challenge for Compression Algorithms: Genetic Sequences. Information Processing and Management 30(6): 875-886 (1994)
[8]
...
[9]
John E. Hopcroft, Jeffrey D. Ullman: Introduction to Automata Theory, Languages and Computation. Addison-Wesley 1979, ISBN 0-201-02888-X
[10]
Balakrishna R. Iyer, David Wilhite: Data Compression Support in Databases. VLDB 1994: 695-704
[11]
Hartmut Liefke, Dan Suciu: An Extensible Compressor for XML Data. SIGMOD Record 29(1): 57-62 (2000)
[12]
...
[13]
Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz: Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics 19(2): 313-330 (1993)
[14]
Svetlozar Nestorov, Serge Abiteboul, Rajeev Motwani: Inferring Structure in Semistructured Data. SIGMOD Record 26(4): 39-43 (1997)
[15]
Wee Keong Ng, Chinya V. Ravishankar: Block-Oriented Compression Techniques for Large Statistical Databases. TKDE 9(2): 314-328 (1997)
[16]
Mark A. Roth, Scott J. Van Horn: Database Compression. SIGMOD Record 22(3): 31-39 (1993)
[17]
...
[18]
...
[19]
...
[20]
Jacob Ziv, Abraham Lempel: A Universal Algorithm for Sequential Data Compression. IEEE Transactions on Information Theory 23(3): 337-343 (1977)

Referenced by

  1. Hartmut Liefke, Dan Suciu: An Extensible Compressor for XML Data. SIGMOD Record 29(1): 57-62 (2000)

BIBTEX


@inproceedings{DBLP:conf/sigmod/LiefkeS00,
  author    = {Hartmut Liefke and
               Dan Suciu},
  editor    = {Weidong Chen and
               Jeffrey F. Naughton and
               Philip A. Bernstein},
  title     = {XMILL: An Efficient Compressor for XML Data},
  booktitle = {Proceedings of the 2000 ACM SIGMOD International Conference on
               Management of Data, May 16-18, 2000, Dallas, Texas, USA},
  journal   = {SIGMOD Record},
  publisher = {ACM},
  volume    = {29},
  number    = {2},
  year      = {2000},
  isbn      = {1-58113-218-2},
  pages     = {153-164},
  crossref  = {DBLP:conf/sigmod/2000},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}




DiSC'01 Copyright ©2002 ACM Inc.