From Eric
| I don't mean to be petty, but didn't we pretty much agree on a different
| sort of nesting of elements in this general model?
I thought we were still trying to work things out. What was it we agreed on...
| Anyway, I can see that we will need to be cautious with terminology here.
| "Element" (and "Attribute") have specific meanings within XML that could
| potentially be confused with the terms as used for descriptive data.
| Similarly, "valid" has a specific meaning within XML (basically, that a
| document is both well-formed and compliant with a specified DTD or Schema),
| that is slightly different from the notion of "validity" you are referring
| to here (or is it?). I realize that in this context, none of us has much
| difficulty using the appropriate mental "namespace", but if we ever reach
| the stage of a formal number, we'll need to be careful of this. (Yes, yes,
| I know I'm being terribly pedantic, and I'm sorry. But then I am addressing
| taxonomists, after all.)
Yes, I know. Can you suggest better terms? If so, we'll use them.
| >It contains information in a structured way that
| >means something, and the information can be parsed. Sure, this document
| >couldn't be used directly as input to Lucid or DELTA, but does that make it
| >worthless? I'm working at the moment towards a future generation of the
| >Lucid Builder (and perhaps future DELTA Builders will be similar) that will
| >indeed be able to collect and interactively integrate such structured bits
| >of information.
|
| Such a document is not "worthless", but it is definitely "worth less" than
| it could be. Arguably, the chief goal in designing a descriptive data model
| is to provide a format or structure which can be readily accessed or
| manipulated by computer. We humans can usually cope fairly well with
| conventional natural language descriptions, but we can do so because of our
| substantial background knowledge of life, the universe, and everything. We
| can know from a quick glance at the above what is intended (provided we can
| read English!). But today's computers have no concept of "Smell", and have
| no awareness that the term "Smell" in this description may be in any way
| related to the term "fragrance" in some other description. In short, such a
| "loosely" structured description has very little advantage over a
| description written in natural language in terms of machine readability,
| yet suffers some reduction in its human readability.
I agree that having a DELTA-like degree of structure is a worthy goal, and I'm not excluding such a degree of structure from the proposed standard, just not requiring it, as I keep saying. What I don't agree with is that it's such an obvious gold standard that we should say "for a description to be regarded as a useful description, it MUST have this degree of structure".
I suppose I keep coming back to this: DELTA has just the degree of structure you wish for, and it's been around for what, 30 years? It's even been given the imprimatur as the "World Standard" already and - without putting too fine a point on it and calling a spade a bloody shovel - it hasn't worked as a standard. Now I suppose I'm going to get howls of protest over this, particularly from people who have been using DELTA for years, but I'd say that it's true. The only reason that this discussion is taking place at all is that TDWG has realised for some time that DELTA as a standard hasn't worked well, and we need to be open about this.
One solution is to make the fixes to DELTA that have been proposed already (and have been in draft now for ten years or so) and XMLify it. But it seems to me that the main limitation of DELTA isn't simply that it isn't XML, or that it's not complex enough (heaven forbid), or some easily-fixable thing like that. It's more fundamental, and I suppose I'm exploring some sort of fundament at the moment. I think part of the basic problem is that it's tried to force too much structure, and while this holds great promise, it's been an impediment in practice. I may well be wrong about this, but I think there's something in it.
| If you can actually create software that can make real sense of bits of
| information like this, you will have accomplished something that has eluded
| AI researchers for the past half century.
I appreciate the irony, but I think that creating software that can handle the degree of structure I'm proposing (which is actually considerable, when you think about it) is a lot different from creating an AI that can precis Hamlet.
| This discussion brings to mind a debate between the mid 20th century
| American poets Robert Frost and Carl Sandburg. Frost argued that writing
| poetry in free verse was like playing tennis without a net. Sandburg
| countered that tennis would be a better game if there were no net to get in
| the way. As one might guess, I've always been inclined to agree with Frost...
Perhaps I'm trying to play lawn tennis with a table-tennis net! Would that be fun? It'd certainly make it easier to get the bloody ball over the net so the game can go on...
Cheers - k