Noel Cross noelc at OEB.HARVARD.EDU
Sat Dec 4 13:34:38 CET 1999

On Sat, 4 Dec 1999, Jean-Marc Vanel wrote:

> And now the figures about the "verbosity" of XML:
> My XML example (a single plant species from the flora of China) compresses from
> 2500 bytes to 1090 (57%):
> http://jmvanel.free.fr/Samples/DisplayDescriptions/species_example.xml
> An Xdelta example (Lepidoptera) on Leigh Dodds' site
> compresses from 50500 bytes to 3600 (93%), because it has a large number of
> repeated tags.
> I used winzip.
> Now we see that we can have all plant species on a CD (650 MB), even without
> compression:
> 2500 bytes * 250 000 species = 625 MB
> And on a DVD you can have 10 times more! And also on the Internet with the new
> HTTP 1.1 protocol, the files can be compressed during transmission. Still 2 more
> arguments: the verbose repeated tags are not repeated when the XML file has been
> parsed and is in memory; the last argument is financial: today you can buy a
> 10 000 megabyte disk for 200 US$, and the price will decrease.
> So file size is NO problem for taxonomy with XML. You must understand that it's
> the small price to pay for extensibility and interoperability.
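
For anyone who wants to check the quoted arithmetic, here is a rough sketch in
Python. It uses only the byte counts given above; the function and variable
names are my own, and megabytes are taken as decimal (1 000 000 bytes), which
is an assumption. The savings come out near the quoted 57% and 93%.

```python
# Byte counts are the ones quoted above; all names are illustrative.
def percent_saved(original_bytes, compressed_bytes):
    """Percentage of the original size removed by compression."""
    return 100.0 * (original_bytes - compressed_bytes) / original_bytes

species_saved = percent_saved(2500, 1090)    # single-species example
xdelta_saved = percent_saved(50500, 3600)    # Xdelta (Lepidoptera) example

# 250 000 species at 2500 bytes each, in decimal megabytes (assumed).
total_megabytes = 2500 * 250_000 / 1_000_000
# The same collection if every record compressed like the 1090-byte example.
compressed_megabytes = 1090 * 250_000 / 1_000_000

print(f"species example: {species_saved:.1f}% saved")
print(f"Xdelta example:  {xdelta_saved:.1f}% saved")
print(f"all species: {total_megabytes} MB raw, {compressed_megabytes} MB compressed")
```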

Agreed. In some cases large file size does present a problem though.

Example 1:  NaviKey is a Java applet that has to do the following:
 - download data
 - parse data
 - interact with user

If the data are uncompressed XML files, then download time is obviously a
major problem. If the XML is compressed, there is still the added time
involved in compressing/uncompressing data, as well as new memory requirements.
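
To get a feel for that trade-off, one could time it. The following is only a
sketch in Python, not NaviKey code: the XML is synthetic, the element names
are made up, and zlib stands in for whatever compression the server or applet
would really use. Timings will vary from machine to machine.

```python
import time
import zlib

# Synthetic, tag-heavy XML; the element names are invented for illustration.
record = "<species><genus>Example</genus><epithet>sample</epithet></species>\n"
xml_bytes = ("<flora>\n" + record * 1000 + "</flora>\n").encode("utf-8")

# Time the compression step (done once, on the sending side).
start = time.perf_counter()
compressed = zlib.compress(xml_bytes)
compress_seconds = time.perf_counter() - start

# Time the decompression step (done by the client before parsing).
start = time.perf_counter()
restored = zlib.decompress(compressed)
decompress_seconds = time.perf_counter() - start

percent_saved = 100.0 * (len(xml_bytes) - len(compressed)) / len(xml_bytes)
print(f"{len(xml_bytes)} -> {len(compressed)} bytes ({percent_saved:.0f}% saved)")
print(f"compress {compress_seconds:.6f}s, decompress {decompress_seconds:.6f}s")
```

Note that in this sketch the client holds both the compressed and the restored
buffers at once, on top of whatever the parser builds, which is the sort of
extra memory cost I have in mind.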

Example 2:  Say I want to run NaviKey on my Palm Pilot*.  The device has
limited resources and I would prefer to use very small files if at all possible.

As you say, these problems will all go away someday.


-Noel Cross

*I only wish it were possible -- A few people have actually asked for this
capability, as it would be handy to use such a program in the field.
