Draft Spec mark 2

Sat Sep 2 08:13:57 CEST 2000

>>>From Eric's comments:

| Kevin has offered us not just 1, but 4, different document models for
| comment. That's a lot to consider, and I won't attempt to offer detailed
| criticism on each alternative. But from my point of view, the first several
| appear far too "weak" to be of much use.

These are four (very early) examples, not four models. My intention is to
develop a specification that allows data to be represented at varying levels
of complexity, from very simple statements to fully worked-up integral data
files such as Lucid and DELTA use. I'm not sure why, but I like the idea
that the most basic document would simply say "Eric Zurcher has two feet".
Perhaps current programs wouldn't be able to make much use of such a
document, but in the future we may have a way of collecting and organising
clouds of such basic scraps (atoms) of description into something quite
powerful.

Yes, I agree. Validation rules need to be built in. I think I said that I
was putting this up for development, not as a finished product.

Perhaps another reason for having character & taxon lists at the head of the
document is to allow an agent that visits the document to determine which
taxa and characters it's about without having to parse the whole thing?
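
To make the point concrete, here is a rough sketch of how such an agent might
read only the head of a document. The element names (description, header,
taxon, character, body, statement) are invented for illustration and are not
part of the draft; the sketch assumes Python's standard ElementTree library.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical layout: none of these element names come from the draft spec.
SAMPLE = b"""<description>
  <header>
    <taxon>Eric Zurcher</taxon>
    <character>feet</character>
  </header>
  <body>
    <statement taxon="Eric Zurcher" character="feet">2</statement>
  </body>
</description>"""


def read_header(stream):
    """Return (taxa, characters) declared in the header of the document."""
    taxa, characters = [], []
    # iterparse yields each element as its closing tag is seen.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "taxon":
            taxa.append(elem.text)
        elif elem.tag == "character":
            characters.append(elem.text)
        elif elem.tag == "header":
            # Header is complete: stop consuming events. The stream is read
            # in chunks, so for a large file most of the body is never read.
            break
    return taxa, characters


taxa, characters = read_header(io.BytesIO(SAMPLE))
print(taxa, characters)
```

With the lists at the head, the agent can stop as soon as the header closes;
with the same information scattered through the body, it would have to walk
every statement.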

This is perhaps a subtle point, perhaps a trivial one, and I'm glad you
raised it. Why is this better than the above? What exactly is the advantage?
I know yours is more DELTA-like and I agree it seems more intuitively
correct, but is there more to your strong belief than that? In truth, I
played with different ways of structuring unitary statements and presented
the one I did to draw comment, rather than simply following the path of what
we do already.

| And I'd be inclined to make a bit more use of attributes, rather than
| element content, though as Bob Morris points out, that is largely (though
| not entirely) a matter of stylistic convention.

See my general comment about true- and pseudo-XML.

| I don't really see any great difficulties in implementing some sort of
| "character list" within the XML syntax. "Document 4" appears to be heading
| in that direction, but doesn't (in my opinion) go quite far enough.

I agree.

| What I think we need is a Schema definition that would allow a validating
| parser to detect obvious data errors, and assist editing software in enforcing
| "correctness" of the data. I've recently been looking through the
| description of XML Schema, and it seems to have the expressive power needed
| for this sort of thing.

Very likely. My opinion is that we should work out our data requirements
(one of which is the need for validation as you properly point out) and then
determine the best vehicle to carry the data.
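
As a way of stating the validation requirement independently of the vehicle,
here is a rough sketch of the kind of check I have in mind: every statement
must refer to a declared taxon and character, and its value must fit the
declared type. This is plain Python standing in for whatever a schema and
validating parser would eventually do; it is not XML Schema, and the element
and attribute names are again invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout, invented for illustration; not part of the draft spec.
DOC = """<description>
  <header>
    <taxon>Eric Zurcher</taxon>
    <character name="feet" type="integer"/>
  </header>
  <body>
    <statement taxon="Eric Zurcher" character="feet">2</statement>
    <statement taxon="Eric Zurcher" character="wings">2</statement>
    <statement taxon="Eric Zurcher" character="feet">two</statement>
  </body>
</description>"""


def is_integer(text):
    """True if text parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def find_errors(xml_text):
    """Check each statement against the taxon and character lists in the header."""
    root = ET.fromstring(xml_text)
    taxa = {t.text for t in root.iter("taxon")}
    types = {c.get("name"): c.get("type") for c in root.iter("character")}
    errors = []
    for s in root.iter("statement"):
        taxon, char = s.get("taxon"), s.get("character")
        value = (s.text or "").strip()
        if taxon not in taxa:
            errors.append(f"undeclared taxon: {taxon}")
        if char not in types:
            errors.append(f"undeclared character: {char}")
        elif types[char] == "integer" and not is_integer(value):
            errors.append(f"non-integer value for {char}: {value}")
    return errors


errors = find_errors(DOC)
for message in errors:
    print(message)
```

Whatever vehicle we settle on should be able to express at least these two
kinds of rule: references must resolve, and values must match their declared
type.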

Cheers - k