Kevin brought up the question of the advantages of XML. Somewhat belatedly, because I have been away for 10 days, here are some comments that try to summarize the discussion as I see it.
1. I agree entirely with Noel that the discussions about XML on this list are intended to _explore_ the advantages of XML, not to settle on it. XML is well defined for use with text documents, but my impression is that the XML variants for data with more structure are still being developed (for example XML-Schema, XML-RDF). So it will be good to wait a while before making decisions. And there is much to do regarding the information model itself! I hope that we will have more discussions about this.
2. I think Leigh Dodds did a good job of listing the advantages in his XDelta document. Generally I think that using standards and standard tools is a good idea. However, XML will most certainly not be an "ultimate standard", as Jean-Marc thinks it is. XML will certainly be replaced by improved mechanisms. My feeling is that it is a sufficiently significant step that it will be supported in a backwards-compatible way even in 50 years. Of course, nobody can predict this.
3. Regarding the issue of data set size, verbosity, and data compression: Some comments, especially those about "using it on PalmTops etc.", confuse an exchange format with a native data format. Especially since XSLT will make it very easy to convert data, any program will most likely have a native, usually binary, format that is optimized for operation speed and/or file size. I think we should not consider size a factor at all for an _exchange_ format. This format should be as robust and as compatible with the future as possible. Many of us are developers of programs and tend to underestimate the effort that goes into the development of data sets. Our programs will certainly be replaced by better ones, but if the data can be maintained, revised, and improved, as has been done for centuries with printed publications, I believe we will make better progress in biodiversity research. The data sets are the gold, and they should be preserved in a way that remains readable in 50 years. For the same reason, however, compressing the data and then writing it to a CD may be a bad idea for long-term storage. Imagine having to decompress something that was compressed with some program on an Apple II machine...
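To make the distinction in point 3 concrete, here is a minimal sketch in Python of a program reading a verbose XML exchange document and packing it into its own compact binary form. Everything in it is an assumption for illustration only: the element and attribute names (`items`, `item`, `id`, `length`), the record layout, and the function names are hypothetical and do not come from any proposal discussed on this list.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical exchange document: verbose, but self-describing and readable.
EXCHANGE_XML = """<items>
  <item id="1" length="12.5"/>
  <item id="2" length="7.25"/>
</items>"""

# Hypothetical native record: little-endian unsigned 32-bit id + 32-bit float.
RECORD = struct.Struct("<If")


def xml_to_native(xml_text):
    """Convert the exchange XML into a compact program-specific byte string."""
    root = ET.fromstring(xml_text)
    return b"".join(
        RECORD.pack(int(item.get("id")), float(item.get("length")))
        for item in root.iter("item")
    )


def native_to_records(blob):
    """Read the fixed-size binary records back as (id, length) tuples."""
    return [
        RECORD.unpack_from(blob, offset)
        for offset in range(0, len(blob), RECORD.size)
    ]


native = xml_to_native(EXCHANGE_XML)
records = native_to_records(native)
```

The binary form is smaller and faster to read, but it is meaningless without the program that knows the record layout, whereas the XML text can still be inspected with any text editor. That is the reason for keeping the exchange format verbose and leaving size optimization to each program's native format.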
Gregor
----------------------------------------------------------
Gregor Hagedorn                 G.Hagedorn@bba.de
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19          Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203
Often wrong but never in doubt!