standard sloppy values (was) The end of DELTA?

Bryan Heidorn heidorn at ALEXIA.LIS.UIUC.EDU
Tue Nov 23 18:02:26 CET 1999


This came up in several different forms already, but any hierarchy should be
able to be terminated at any level without generating a syntax error.
Semantics need to be defined based on the depth of the hierarchy in any
particular instance. This will help to accommodate the large body of
semi-structured text descriptions that already exist and were discussed
earlier. For example, the "Leaf" node might be defined to contain a list of
optional properties (or sub-nodes) such as size, shape... but the standard
should allow for free text description as well, perhaps by allowing for a
<NaturalLanguage> as one of the options, sort of an "other" category but with
definable semantics. So a system might treat the value of this field as a term
vector in a full text retrieval system or some other application we have not
thought of yet. It could coexist with the other more restrictive nodes such as
<size>. This would allow partially successful parsers to "automatically" mark
up full text documents to conform to the standard.
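
To make this concrete, here is a rough sketch in Python of how a reader
program might cope with a hierarchy that stops at different depths. The sample
markup is invented for illustration: the <description> and <flower> elements,
the spelling NaturalLanguage for the free-text option, and the function names
summarize and term_vector are all assumptions, not part of any agreed
standard.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: only "leaf", "size" and a free-text option come from
# the discussion; every other name here is an assumption.
SAMPLE = """
<description>
  <leaf>
    <size>5-10 cm</size>
    <NaturalLanguage>blade ovate, margins finely toothed</NaturalLanguage>
  </leaf>
  <flower/>
</description>
"""


def term_vector(text):
    """Count lowercase word occurrences, as a crude full-text term vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def summarize(node, depth=0):
    """Walk the tree; a node that stops early is reported, not rejected."""
    text = (node.text or "").strip()
    children = list(node)
    if node.tag == "NaturalLanguage":
        kind = "free text"
    elif children:
        kind = "branch"
    elif text:
        kind = "value"
    else:
        kind = "terminated"  # hierarchy ends here with no detail below it
    rows = [(depth, node.tag, kind, text)]
    for child in children:
        rows.extend(summarize(child, depth + 1))
    return rows


rows = summarize(ET.fromstring(SAMPLE))
for depth, tag, kind, text in rows:
    print("  " * depth + f"{tag} [{kind}] {text}")
    if kind == "free text":
        print("  " * depth + f"  terms: {dict(term_vector(text))}")
```

The point of the sketch is only that <flower/> with nothing beneath it and
<leaf> with both a structured <size> and a free-text child are handled by the
same walk, so a partially successful parser can emit whatever depth it managed
to reach.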

This use of parsers brings up a meta-data issue that came up at the TDWG
conference. Many data elements can have a certainty associated with them.
Certainty is of at least two types. One is the confidence that the data in the
node is actually correct. A good example might be programs that attempt to
register location based on verbal descriptions. This might only matter for
computer programs generating the data and programs reading the data, but I can
imagine human uncertainty as well. For example, an entomologist might know part
of an insect's habitat or range, leading to uncertainty in this data.

The second type of uncertainty is the type used in LucID for indicating how
well the characteristic classifies the organism being described for
identification purposes.
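
As a sketch of how the two kinds of certainty could be kept apart in a data
structure, consider the Python below. Everything in it is an assumption made
for illustration: the field names confidence and diagnostic_weight, the 0 to 1
scales, the 0.5 cutoff, and the sample records. It does not describe how LucID
or any other program actually stores these values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CharacterValue:
    character: str
    value: str
    # Type 1: how sure we are that the recorded value is correct (0 to 1).
    confidence: Optional[float] = None
    # Type 2: how well this character separates taxa in a key (0 to 1).
    diagnostic_weight: Optional[float] = None


def rank_for_identification(values: List[CharacterValue],
                            min_confidence: float = 0.5) -> List[CharacterValue]:
    """Drop values known to be doubtful, then put the most diagnostic first.

    A missing confidence is treated as "not stated" and kept, which is a
    design choice of this sketch rather than a rule from any standard.
    """
    kept = [v for v in values
            if v.confidence is None or v.confidence >= min_confidence]
    return sorted(kept, key=lambda v: v.diagnostic_weight or 0.0, reverse=True)


# Invented sample records.
records = [
    CharacterValue("locality", "near river, north of town", confidence=0.3),
    CharacterValue("wing colour", "orange", confidence=0.9, diagnostic_weight=0.8),
    CharacterValue("habitat", "wet meadow", diagnostic_weight=0.4),
]

ranked = rank_for_identification(records)
for item in ranked:
    print(item.character, item.value, item.confidence, item.diagnostic_weight)
```

Keeping the two numbers in separate fields means a program can filter on one
and sort on the other without confusing "this value may be wrong" with "this
character is not very useful for identification".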

just 2-cents from an information scientist.

-- Bryan Heidorn
At 06:47 PM 11/23/99 -0500, you wrote:
>>Could one constrain further by expressing within the schema the hierarchical
>>relationships between the elements (eg 'blade' and 'petiole' as child
>>elements of 'leaf') or would the introduction of terminology into the
>>'standard' be a step too far?
>>
>
>Putting on my programmer hat here, those kinds of things should be able
>to be included in whatever XML standards were set up relatively transparently
>to the user.  For instance, "leaf" by default would contain certain properties
>(size, shape, petiole, base, apex, number, arrangement, location) that
>the user could code or mark as non-existent without having to worry
>about the "parent/child" or other object relationship.  Folk who want to
>*write* software that uses the XML (or whatever) standard would be the
>ones to worry about those issues.
>
>Susan Farmer
>sfarmer at goldsword.com
>Botany Department, University of Tennessee
>http://www.goldsword.com/sfarmer/Trillium
>
--
--------------------------------------------------------------------
  P. Bryan Heidorn    Graduate School of Library and Information Science
  pheidorn at uiuc.edu   University of Illinois at Urbana-Champaign
  (V)217/ 244-7792    501 East Daniel St., Champaign, IL  61820-6212
  (F)217/ 244-3302    http://alexia.lis.uiuc.edu/~heidorn

"You've got to run just to stand still and if you want to get anywhere you
have to run twice as fast."  Red Queen in "Alice in Wonderland"



