This has come up in several different forms already, but any hierarchy should be able to terminate at any level without generating a syntax error. The semantics would then need to be defined according to the depth the hierarchy reaches in any particular instance. This will help to accommodate the large body of semi-structured text descriptions that already exist and were discussed earlier. For example, the "Leaf" node might be defined to contain a list of optional properties (or sub-nodes) such as size, shape... but the standard should allow for free-text description as well, perhaps by allowing a <NaturalLanguage> element as one of the options: a sort of "other" category, but with definable semantics. A system might then treat the value of this field as a term vector in a full-text retrieval system, or use it in some other application we have not thought of yet. It could coexist with the other, more restrictive nodes such as <size>. This would allow partially successful parsers to "automatically" mark up full-text documents to conform to the standard.
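To make the idea concrete, here is a minimal sketch in Python of a reader that accepts a hierarchy stopping at any depth. The element names (Description, Leaf, Flower, size, NaturalLanguage) and the sample text are assumptions made up for illustration, not part of any agreed standard, and splitting the free text into lowercase words is only a crude stand-in for a real term vector.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names and content are illustrative only.
DOC = """
<Description>
  <Leaf>
    <size>3-5 cm</size>
    <NaturalLanguage>margins finely serrate, sometimes tinged red</NaturalLanguage>
  </Leaf>
  <Flower>
    <NaturalLanguage>Flowers solitary, white fading to pink.</NaturalLanguage>
  </Flower>
</Description>
"""

def summarize(node, depth=0):
    """Walk the tree, accepting a hierarchy that terminates at any level."""
    rows = []
    for child in node:
        text = (child.text or "").strip()
        if child.tag == "NaturalLanguage":
            # Free text: keep it as a list of terms, e.g. for full-text retrieval.
            rows.append((depth, "free-text", text.lower().split()))
        elif len(child) == 0:
            # A structured leaf value such as <size>.
            rows.append((depth, child.tag, text))
        else:
            # An inner node: record it, then descend one level.
            rows.append((depth, child.tag, None))
            rows.extend(summarize(child, depth + 1))
    return rows

rows = summarize(ET.fromstring(DOC))
for row in rows:
    print(row)
```

Here the Flower node stops at free text while the Leaf node mixes a structured <size> with free text, and both parse without error. A partially successful parser could emit either form.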
This use of parsers brings up a meta-data issue that came up at the TDWG conference. Many data elements can have a certainty associated with them. Certainty is of at least two types. One is the confidence that the data in the node is actually correct. A good example might be programs that attempt to register a location based on verbal descriptions. This might only matter for computer programs generating the data and programs reading the data, but I can imagine human uncertainty as well. For example, an entomologist might know only part of an insect's habitat or range, leading to uncertainty in this data.
The second type of uncertainty is the type used in LucID for indicating how well the characteristic classifies the organism being described for identification purposes.
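As a rough sketch of how the two kinds of certainty might travel with the data, the Python below reads them from optional attributes. The attribute names "confidence" (is the value correct?) and "weight" (how well does the character discriminate for identification?), the 0-to-1 scale, and the sample values are all assumptions for illustration. They are not taken from LucID or from any existing standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical attributes and values, invented for this example.
DOC = """
<Description>
  <Locality confidence="0.6">roadside near the river bridge</Locality>
  <Leaf>
    <shape weight="0.9">ovate</shape>
    <size>3-5 cm</size>
  </Leaf>
</Description>
"""

def certainty(elem, attr):
    """Return the attribute as a float, or None when no certainty is stated."""
    value = elem.get(attr)
    return None if value is None else float(value)

root = ET.fromstring(DOC)

# Type 1: nodes whose stated confidence falls below a chosen threshold.
doubtful = []
for elem in root.iter():
    conf = certainty(elem, "confidence")
    if conf is not None and conf < 0.8:
        doubtful.append(elem.tag)

# Type 2: characters that carry an identification weight.
weights = {}
for elem in root.iter():
    w = certainty(elem, "weight")
    if w is not None:
        weights[elem.tag] = w

print(doubtful, weights)
```

Leaving the attributes optional means a node with no stated certainty is reported as unknown rather than silently treated as fully certain.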
Just my 2 cents, from an information scientist.
-- Bryan Heidorn

At 06:47 PM 11/23/99 -0500, you wrote:
Could one constrain further by expressing within the schema the hierarchical relationships between the elements (e.g. 'blade' and 'petiole' as child elements of 'leaf'), or would the introduction of terminology into the 'standard' be a step too far?
Putting on my programmer hat here, those kinds of things should be able to be included in whatever XML standards are set up, relatively transparently to the user. For instance, "leaf" by default would contain certain properties (size, shape, petiole, base, apex, number, arrangement, location) that the user could code or mark as non-existent without having to worry about the "parent/child" or other object relationship. Folk who want to *write* software that uses the XML (or whatever) standard would be the ones to worry about those issues.
Susan Farmer sfarmer@goldsword.com Botany Department, University of Tennessee http://www.goldsword.com/sfarmer/Trillium
-- -------------------------------------------------------------------- P. Bryan Heidorn Graduate School of Library and Information Science pheidorn@uiuc.edu University of Illinois at Urbana-Champaign (V)217/ 244-7792 501 East Daniel St., Champaign, IL 61820-6212 (F)217/ 244-3302 http://alexia.lis.uiuc.edu/~heidorn
"You've got to run just to stand still and if you want to get anywhere you have to run twice as fast." Red Queen in "Alice in Wonderland"