Re: Minimalism AND functionalism
At 09:31 5/09/2000 +1000, Kevin Thiele wrote:
So I'm proposing, as ground-zero, that the only absolute requirements are that a document have DESCRIPTION elements, that these have ELEMENT elements (terminology is unfortunate here) and that these have VALUEs. This is a long way down the track towards minimalism, I know, but I'm floating it for comment.
So, as I've exampled before, I'd like something like:
<DOCUMENT ID = "d2" Name = "One thing I know about a rose">
  <DESCRIPTION>
    <ITEM> <ITEM_NAME> Rose </ITEM_NAME> </ITEM>
    <ELEMENT> <ELEMENT_NAME> Smell </ELEMENT_NAME> </ELEMENT>
    <VALUE> Sweet </VALUE>
  </DESCRIPTION>
</DOCUMENT>
to be a valid document.
I don't mean to be petty, but didn't we pretty much agree on a different sort of nesting of elements in this general model?
Anyway, I can see that we will need to be cautious with terminology here. "Element" (and "Attribute") have specific meanings within XML that could potentially be confused with the terms as used for descriptive data. Similarly, "valid" has a specific meaning within XML (basically, that a document is both well-formed and compliant with a specified DTD or Schema), that is slightly different from the notion of "validity" you are referring to here (or is it?). I realize that in this context, none of us has much difficulty using the appropriate mental "namespace", but if we ever reach the stage of a formal number, we'll need to be careful of this. (Yes, yes, I know I'm being terribly pedantic, and I'm sorry. But then I am addressing taxonomists, after all.)
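To make the XML sense of these terms concrete, here is a minimal sketch in Python (my choice of language, not anything proposed in this thread) that tests only well-formedness, using the standard library's xml.etree.ElementTree. The names SAMPLE, BROKEN and is_well_formed are illustrative assumptions. Note that this parser does not validate: checking validity in the XML sense would additionally need a DTD or Schema and a validating parser, neither of which exists yet for this model.

```python
import xml.etree.ElementTree as ET

# The sample description from the quoted message, with matched tags.
SAMPLE = """<DOCUMENT ID="d2" Name="One thing I know about a rose">
  <DESCRIPTION>
    <ITEM><ITEM_NAME>Rose</ITEM_NAME></ITEM>
    <ELEMENT><ELEMENT_NAME>Smell</ELEMENT_NAME></ELEMENT>
    <VALUE>Sweet</VALUE>
  </DESCRIPTION>
</DOCUMENT>"""

# Hypothetical damaged copy: the closing VALUE tag has been dropped.
BROKEN = SAMPLE.replace("</VALUE>", "")


def is_well_formed(text):
    """Return True if the text parses as XML (well-formed), else False.

    This says nothing about validity: no DTD or Schema is consulted.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("sample well-formed:", is_well_formed(SAMPLE))
    print("broken well-formed:", is_well_formed(BROKEN))
```

A document can pass this check and still be "invalid" in the XML sense once a DTD or Schema is specified, which is why the two uses of the word need to be kept apart.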
It contains information in a structured way that means something, and the information can be parsed. Sure, this document couldn't be used directly as input to Lucid or DELTA, but does that make it worthless? I'm working at the moment towards a future generation of the Lucid Builder (and perhaps future DELTA Builders will be similar) that will indeed be able to collect and interactively integrate such structured bits of information.
Such a document is not "worthless", but it is definitely "worth less" than it could be. Arguably, the chief goal in designing a descriptive data model is to provide a format or structure which can be readily accessed or manipulated by computer. We humans can usually cope fairly well with conventional natural language descriptions, but we can do so because of our substantial background knowledge of life, the universe, and everything. We can know from a quick glance at the above what is intended (provided we can read English!). But today's computers have no concept of "Smell", and have no awareness that the term "Smell" in this description may be in any way related to the term "fragrance" in some other description. In short, such a "loosely" structured description has very little advantage over a description written in natural language in terms of machine readability, yet suffers some reduction in its human readability.
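The "Smell" versus "fragrance" point can be sketched as follows, again in Python and again purely as an illustration. The two description fragments, the CANONICAL table and the function names are all hypothetical; in particular, the mapping of both terms to "odour" stands in for whatever agreed character list a more rigorous model might supply.

```python
import xml.etree.ElementTree as ET

# Two hypothetical descriptions using free-text element names.
ROSE = ("<DESCRIPTION><ELEMENT><ELEMENT_NAME> Smell </ELEMENT_NAME></ELEMENT>"
        "<VALUE> Sweet </VALUE></DESCRIPTION>")
LILY = ("<DESCRIPTION><ELEMENT><ELEMENT_NAME> fragrance </ELEMENT_NAME></ELEMENT>"
        "<VALUE> Sweet </VALUE></DESCRIPTION>")

# Hypothetical controlled vocabulary; nothing like this is defined in the thread.
CANONICAL = {"smell": "odour", "fragrance": "odour"}


def element_names(xml_text):
    """Collect the raw ELEMENT_NAME strings, trimmed of whitespace."""
    root = ET.fromstring(xml_text)
    return {(e.text or "").strip() for e in root.iter("ELEMENT_NAME")}


def canonical_names(xml_text):
    """Map each raw name through the vocabulary, falling back to lower case."""
    return {CANONICAL.get(n.lower(), n.lower()) for n in element_names(xml_text)}


# Plain string comparison finds nothing in common between the descriptions.
shared_raw = element_names(ROSE) & element_names(LILY)

# Only with the externally supplied vocabulary do the two line up.
shared_mapped = canonical_names(ROSE) & canonical_names(LILY)

if __name__ == "__main__":
    print("shared without vocabulary:", shared_raw)
    print("shared with vocabulary:", shared_mapped)
```

The software does no reasoning here: the link between the two terms exists only because a person wrote it into the table, which is the kind of rigour a loosely structured document leaves out.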
If you can actually create software that can make real sense of bits of information like this, you will have accomplished something that has eluded AI researchers for the past half century.
Of course it's important to be able to store loosely structured bits of information, but in general the more rigour we can apply, the better.
This discussion brings to mind a debate between the mid-20th-century American poets Robert Frost and Carl Sandburg. Frost argued that writing poetry in free verse was like playing tennis without a net. Sandburg countered that tennis would be a better game if there were no net to get in the way. As one might guess, I've always been inclined to agree with Frost...
Eric Zurcher CSIRO Division of Entomology Canberra, Australia E-mail: ericz@ento.csiro.au