Compression of XML files

18 Dec 1999

People interested in XML file size may want to look at XMill, a new
free program from AT&T Research Labs and UPenn. XMill is said to
compress up to twice as well as gzip does on proprietary formats, when
data in those formats is first rendered in XML and then compressed
with XMill. This is essentially because XMill can use the tag
information to group similar data together, which exposes more
redundancy to the compressor.
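
To make the idea of reorganizing by tag concrete, here is a rough sketch
in Python. It is not XMill's code or its file format, and it is not a
reversible codec; the function names and the sample data are invented
for illustration. It separates the tag sequence from the text values,
puts all values that share a tag into one container, compresses each
container with zlib, and reports the total next to the size of the same
document compressed whole. No particular ratio is claimed.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_into_containers(xml_text):
    """Separate the tag sequence from the text values, grouping values by tag."""
    root = ET.fromstring(xml_text)
    structure = []                  # tag names in document order
    containers = defaultdict(list)  # tag name -> text values found under that tag
    for elem in root.iter():
        structure.append(elem.tag)
        text = (elem.text or "").strip()
        if text:
            containers[elem.tag].append(text)
    return structure, dict(containers)


def compressed_size_grouped(xml_text):
    """Total bytes when the structure and each container are compressed separately."""
    structure, containers = split_into_containers(xml_text)
    total = len(zlib.compress("\n".join(structure).encode("utf-8")))
    for tag in sorted(containers):
        total += len(zlib.compress("\n".join(containers[tag]).encode("utf-8")))
    return total


def compressed_size_plain(xml_text):
    """Bytes when the whole document is compressed as one stream."""
    return len(zlib.compress(xml_text.encode("utf-8")))


# Hypothetical sample data: a small log with two kinds of values per entry.
SAMPLE = "<log>" + "".join(
    f"<entry><host>10.0.0.{i % 7}</host><status>{200 if i % 5 else 404}</status></entry>"
    for i in range(200)
) + "</log>"

if __name__ == "__main__":
    print("plain  :", compressed_size_plain(SAMPLE))
    print("grouped:", compressed_size_grouped(SAMPLE))
```

Whether grouping wins depends on the data. The sketch only shows where
the tag information enters: values of one kind end up next to each
other instead of being interleaved with markup and other values.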

http://www.research.att.com/sw/tools/xmill/

points to the software and technical material about it.

One of the authors, Dan Suciu, is also a co-author of "Data on the
Web: From Relations to Semistructured Data and XML", Morgan Kaufmann.
This is a great book whose title is accurate. To read it, you have to
be comfortable with graph theory, regular expressions, and finite
state automata. If you are not, you can accept this as a summary: XML
is more general than relational databases and a little less general
than pure object-oriented databases.


Robert A. (Bob) Morris