Bob wrote:
The distinction we are presently thinking about is purely syntactic. It is whether the XML data representation is tree structured (by a non-linear tree) and without idrefs, or linearly structured and with idrefs. In fact, any tree can be flattened into a list, if by nothing else than adding references to the mix.
But this would seem to me to be a pretty important distinction and something that would have to be decided one way or the other in the Recommendation that we are going to come up with. Or are we thinking of a Recommendation that would allow both? It would be nice to be able to say, here is the one true path, wouldn't it?
It's easy to be convinced of this: trees can be represented in the conventional linear memory supplied with most computers, so it is clearly possible (and not that hard). What we want to get right is not whether this can be done, since we know it can, but specifically whether XML's reference mechanisms will capture the requirements of the intended stakeholders in the enterprise. We hope they do, because there is a BIG advantage to using XML: there is an abundance of tools, protocols, databases and services available to those who adopt it.
Bob, can you post an example of what a tree structure represented linearly with idrefs _might_ look like?
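Purely as an illustration of the two shapes under discussion, and not as anything Bob posted, here is a minimal Python sketch that takes a small nested document and rewrites it as a flat list in which each node points at its parent by id. The element names (plant, leaf, margin, flower, nodes, node) and the id, name and parent attributes are all made up for the example; in a real DTD the parent attribute would presumably be declared as an IDREF.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested document; the element names are invented for illustration.
NESTED = """
<plant>
  <leaf>
    <margin>serrate</margin>
  </leaf>
  <flower/>
</plant>
"""

def flatten(root):
    """Return a flat <nodes> list in which each node names its parent by id."""
    flat = ET.Element("nodes")
    counter = 0

    def visit(elem, parent_id):
        nonlocal counter
        counter += 1
        node_id = f"n{counter}"
        attrs = {"id": node_id, "name": elem.tag}
        if parent_id is not None:
            attrs["parent"] = parent_id  # would be an IDREF in a DTD (assumption)
        node = ET.SubElement(flat, "node", attrs)
        text = (elem.text or "").strip()
        if text:
            node.text = text
        # Children are emitted as siblings in the flat list, not nested.
        for child in elem:
            visit(child, node_id)

    visit(root, None)
    return flat

flat = flatten(ET.fromstring(NESTED))
print(ET.tostring(flat, encoding="unicode"))
```

The printed result is a single level of node elements: the nesting has gone from the markup and survives only in the parent references, which is the trade-off being argued about.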
For me, what is lurking under the covers here is this: I don't think that high quality human readability of the naked markup is that important. I think it is the job of application software to render markup palatable to humans if that is the problem, or to other software if that is the problem.
At the end of the day we unfortunately have to deal with humans, and taxonomic humans which are even worse, so readability of final output is going to be a *major* issue. The data will need to be in a form that appropriate tools can transmute into something as close to natural language as possible. Otherwise the punters will walk away...
We extracted "highly" and "maybe" from the original. This raises an interesting point. It may well be good to recommend a specific set of qualifiers. That allows more intelligent applications. For example (and neglecting that "highly variable" and "frequently variable" are probably not acceptable as nearly equivalent), it might be easier to guide identification software if "frequently" is uniformly used where appropriate, thus perhaps identifying some initially more (or less) important feature. However, (a) an author might not be completely satisfied with that qualifier and (b) an author may be satisfied but prefer something else in natural language discourse.
I was thinking the exact same thoughts and even started preparing a list of suggested qualifiers, but quickly gave up because the list would end up as infinite as individual human vagueness and indecision and preferences for particular turns of phrase. It started to look too much like the "trying to standardize character lists" exercise.
Lucid appeared to offer one - the "by misinterpretation" thing. Delta appeared to offer infinite - through the comment thing. Or are these conceptually different attributes? Perhaps it might be possible to come up with a manageable list of qualifiers? Qualifiers In Current Use?
To have any useful application you would have to come up with a limited list of recommended qualifiers, and a mountain of possible synonyms, or almost synonyms, beneath them. All too scary for me...
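To make the "limited list with synonyms beneath it" idea concrete, here is a small Python sketch. The two recommended qualifiers and their synonyms are invented placeholders, not a proposed vocabulary, and whether any two words really are interchangeable is exactly the judgement call being worried about above.

```python
# Hypothetical recommended qualifiers, each with near-synonyms beneath it.
# Both the official words and the synonyms are placeholders for illustration.
RECOMMENDED = {
    "frequently": {"often", "commonly"},
    "rarely": {"seldom", "infrequently"},
}

# Invert the table once so each lookup is a single dictionary access.
CANONICAL = {}
for official, synonyms in RECOMMENDED.items():
    CANONICAL[official] = official
    for word in synonyms:
        CANONICAL[word] = official

def normalize(qualifier):
    """Return the recommended qualifier, or None if the word is not listed."""
    return CANONICAL.get(qualifier.strip().lower())

print(normalize("Often"))
print(normalize("highly"))
```

A word that is not in the table comes back as None rather than being guessed at, so an application can fall back to treating it as free text.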
Probably this entails recommending a mechanism for specifying alternatives to the 'official' qualifier. Those mechanisms could guide applications that were aware of them. For example:
<FEATURE_VALUE QUALIFIER="frequently" RENDER_QUALIFIER_AS="highly">
Why would you want to say this sort of thing in the XML dataset rather than the XML aware application? If your preferred qualifier is 'highly' rather than 'frequently', why wouldn't you just code the data that way in the first place? Surely the most appropriate place for this sort of translation is in the application that is designed to render data in a particular way for a particular purpose (as is the wont of those fascist editor types and other control freaks)?
I guess you might want to tell an application: if the only choice of qualifiers you have is this, then, in this instance, use this particular one. Sort of like the converse of how style sheets can specify a font but give the application a choice of what to use if that font cannot be found.
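For what it is worth, the RENDER_QUALIFIER_AS idea can be sketched in a few lines of Python. The attribute names come from the fragment quoted earlier; the function name, the honour_override flag and the "variable" content used to close the element are assumptions added so that the example parses and runs.

```python
import xml.etree.ElementTree as ET

def rendered_qualifier(elem, honour_override=True):
    """Pick the word to display for a FEATURE_VALUE element.

    An application that knows about RENDER_QUALIFIER_AS uses it when present;
    one that does not (honour_override=False) sees only the official QUALIFIER.
    """
    official = elem.get("QUALIFIER")
    override = elem.get("RENDER_QUALIFIER_AS")
    if honour_override and override:
        return override
    return official

# The fragment from the thread, closed and given made-up content so it parses.
fv = ET.fromstring(
    '<FEATURE_VALUE QUALIFIER="frequently" RENDER_QUALIFIER_AS="highly">'
    "variable</FEATURE_VALUE>"
)

aware = rendered_qualifier(fv)
unaware = rendered_qualifier(fv, honour_override=False)
print(aware, unaware)
```

Under this reading the official qualifier stays in the data for identification software, and the alternative only affects what a rendering application chooses to show.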
Nope... still too scary...
jim