> As an attempt to start a discussion, I am posting this to ask: do we all
> really agree that there should be a new standard for descriptive data
> based on XML, as a substitute for DELTA (as well as NEXUS and XDF)? Or
> should we instead just try to improve one of the existing formats?
From a purely technological perspective I'm convinced that XML is an
excellent mechanism for describing structured data - which is what
much of DELTA is all about.
Standardising on something like XML means that custom parsers/parsing
routines can be eliminated, making it much easier to write software
to manipulate the information. I've already run over these issues
in my web page on XDELTA.
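To illustrate the point about parsers, here is a minimal sketch using the
XML parser that ships with Python. The element and attribute names
(dataset, character, item, value) are invented for this example and are
not taken from the XDELTA proposal; the sample data is made up as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are invented for
# illustration only and do not come from XDELTA or any other proposal.
SAMPLE = """\
<dataset>
  <character id="1" type="text"><name>leaf shape</name></character>
  <character id="2" type="real"><name>leaf length</name></character>
  <item name="Species A">
    <value char="1">ovate</value>
    <value char="2">4.5</value>
  </item>
</dataset>
"""


def load_items(xml_text):
    """Return {item name: {character name: raw value string}}."""
    root = ET.fromstring(xml_text)
    # Map character ids to their human-readable names.
    names = {c.get("id"): c.findtext("name") for c in root.findall("character")}
    items = {}
    for item in root.findall("item"):
        items[item.get("name")] = {
            names[v.get("char")]: v.text for v in item.findall("value")
        }
    return items


items = load_items(SAMPLE)
print(items)
```

No tokenizer or grammar had to be written here; the only format-specific
code is the handful of lines that walk the tree.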
> As of myself, when I first read the specifications of "XDELTA", I was under
> the impression that in some way the DELTA format as we know it would
> then become obsolete... But what of NEXUS, or XDF? Has anyone considered
> of the integration of these formats plus DELTA into a single new,
> XML-based, format for descriptive data?
:) I didn't mean to imply that, just that using XML would provide a
better underlying data format.
The reason I've been holding fire on additional work is that I believe a
proper requirements analysis needs to be done to decide what data needs
to be captured by the standard. Does DELTA meet all the current requirements
of taxonomists? Are there other standards (e.g. NEXUS) which are better
suited to some applications/data? Can these be integrated into a single
new standard (or a modular standard)?
As a computer scientist rather than a taxonomist I don't feel qualified
to address these issues, as I'm not working with this data on a daily
basis.
Perhaps we should start by building a list of requirements and then
measuring DELTA, NEXUS, etc against them to see whether they meet
them. If they do, then there's no reason for change; if they don't,
then it may be time for a new standard. XML is only one possible
'syntax' for such a standard, but it's one which has a lot of
software support.
Note that when I refer to DELTA I refer to the data/file format
specifically and not the application software. These should be
evaluated separately.
So, what do people see as the basic requirements for this kind
of format?
- ease of use (i.e. authoring)
- ease of processing (parsing, validating, reading, converting)
- ease of sharing (i.e. distribution)
- openness (i.e. proprietary vs. non-proprietary)
- ease of extensibility (i.e. ability to add more information cleanly at a
later date)
- internationalization
- unambiguity of data representation
- unlimited size of data sets? (i.e. any limitation on character names,
lengths, item names, numbers, etc)
What types of data need to be modelled by the format? (this can
be a post-requirements gathering step, but some consideration needs
to be given early on to measure the 'success' of the current
formats)
- what types of characters? (real, integer, text, etc)
- what types of data (text, images, other formats?)
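As a rough sketch of what typed characters could look like in software,
assuming only the three types mentioned above (real, integer, text): the
class and field names below are hypothetical, and a real requirements list
may well call for more types.

```python
from dataclasses import dataclass

# Assumed set of character types, taken from the question above.
# Each type maps to the Python callable used to convert a raw string.
CONVERTERS = {"real": float, "integer": int, "text": str}


@dataclass
class Character:
    name: str
    kind: str  # expected to be one of the CONVERTERS keys

    def parse(self, raw):
        """Convert a raw string to this character's type, or raise ValueError."""
        if self.kind not in CONVERTERS:
            raise ValueError(f"unknown character type: {self.kind}")
        return CONVERTERS[self.kind](raw)


length = Character("leaf length", "real")
count = Character("petal number", "integer")
print(length.parse("4.5"), count.parse("5"))
```

A value that does not fit its declared type (for example "4.5" for an
integer character) raises ValueError, which is one way of measuring the
un-ambiguity requirement listed earlier.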
Just some thoughts.
L.