Minimalism AND functionalism

Sat Sep 9 10:22:53 CEST 2000

>At 09:01 8/09/2000 +1000, Kevin Thiele wrote:
>
>>So tell me, does *anyone* out there agree with me?
>>
>
>Well, I don't think we're really all that far apart. As Jim Croft nicely
>summarized it:
>
>"Is this the consensus we are arriving at?  That we strive for structure and
>comparability but that we accommodate the free text 'blob' because
>oftentimes it may be the best we can get? Yes? Great!"
>
>So I'm largely in agreement with you, Kevin, with the disagreement perhaps
>being on where we should aim as our primary target on the
>structured/unstructured spectrum. (That is, I'd prefer to see highly
>structured information as the "default", with loosely structured "blobs"
>being optional, rather than the other way round.)

Yes, I think that this should be the primary focus as well -- both as a
programmer and a botanist.

>A similar sort of problem must occur commonly with specimen databases. For
>example, how do you capture a "location"? You might prefer to have lats,
>longs, and elevations nicely geocoded to the nearest meter, but the label
>on an older specimen might not say much more than "in a shady little
>billabong along Reedy Creek, back of Beyond". It's desirable to be able to
>store that information one way or another, even when it doesn't fit into
>your preferred structure, but use structured data when you can get it.

And this is something which I am currently *painfully* aware of.  I'm a
member of the ATBI plants group (the Biodiversity Inventory of the Great
Smoky Mountains National Park ...  qv. www.discoverlife.org or
www.goldsword.com/sfarmer/ATBI for our taxon pages ... they're supposed
to get crosslinked *eventually* ... ) and one of the things that we're
doing at the moment is databasing all of our legacy data -- all the
herbarium sheets that we have.

Unfortunately, for starters, all of that location data is going into
a big blob *initially* so that we can analyse it and determine which
fields we have and which ones we really want to use -- we're also
working with the folks who are designing the Final Database that we're
going to use, because we have the most legacy data already in databases.

And this might be something to consider --
your legacy label says something like "adjacent to Reedy Creek." and
that's all.  So, you go back later, and *can* add additional information
because that population is still there -- perhaps it's the only one
along the creek -- how do you add that information into the existing
record and identify it as *added* data?  It's valid -- some of it
more so than others, perhaps.  Maybe all you might be able to add is
the County -- or a gross locality name -- "Greenbrier Section" but
you want to identify it as added data.  Would that be something that
this group would want to consider?  I know that it goes back to
the Validation issues ...
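
To make the question concrete, here is a minimal sketch in Python of one
way it could work: keep the label text untouched as the "blob", and tag
every structured field with where it came from. All of the names here
(SpecimenLocality, LocalityField, the "label"/"added" source values) are
hypothetical, made up for illustration -- they are not part of any
existing database design or proposal.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LocalityField:
    """One structured piece of locality data, tagged with its origin."""
    name: str
    value: str
    source: str                 # "label" (on the sheet) or "added" (later)
    note: Optional[str] = None  # free-text remark on how it was determined


@dataclass
class SpecimenLocality:
    """Verbatim label text plus any structured fields derived from it."""
    verbatim: str                                   # the untouched "blob"
    fields: List[LocalityField] = field(default_factory=list)

    def add(self, name: str, value: str, note: Optional[str] = None) -> None:
        """Record information that was NOT on the original label."""
        self.fields.append(LocalityField(name, value, "added", note))

    def added_fields(self) -> List[LocalityField]:
        """Return only the fields that were supplied after the fact."""
        return [f for f in self.fields if f.source == "added"]


# The legacy label says very little; the section name is supplied later.
record = SpecimenLocality(verbatim="adjacent to Reedy Creek.")
record.add("section", "Greenbrier Section",
           note="gross locality name, added on revisit")
```

The label text is never overwritten, so anyone who doubts an added value
can still see exactly what the sheet said, and added_fields() picks out
everything that came later.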

>
>As a programmer, I'd like to be able to know whether I'm looking at a
>rigorously defined object or something "fuzzier". I think one could fairly
>easily accommodate both. Modifying one of Kevin's examples only slightly,
>you could have something like:
>
><DOCUMENT>
>    <DESCRIPTION Taxon_Name = "Viola odorata">
>        <CHARACTER type="defined" Character_Name = "Leaves">
>            <STATE State_Name = "present"/>
>        </CHARACTER>
>        <CHARACTER type="arbitrary" Character_Name = "scent">
>           a marvelous perfume on a perfect spring day
>        </CHARACTER>
>    </DESCRIPTION>
></DOCUMENT>
>
>where there is some small distinction made between rigorously defined
>characters (those which can be validated against some "character list",
>sensu lato, and for which cross-taxa comparisons are clearly meaningful),
>and characters defined "on the fly". (Note: I'm not so sure that using
>attributes is syntactically the best way to do this, but it illustrates the
>principle.)
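
As a rough illustration of what that distinction buys a programmer, here
is a sketch in Python that validates the "defined" characters against a
character list and simply collects the "arbitrary" ones as free text.
The CHARACTER_LIST contents and the check_description function are
assumptions invented for this example, and the STATE element is written
as an empty element so that the sample parses as XML.

```python
import xml.etree.ElementTree as ET

DOC = """<DOCUMENT>
    <DESCRIPTION Taxon_Name="Viola odorata">
        <CHARACTER type="defined" Character_Name="Leaves">
            <STATE State_Name="present"/>
        </CHARACTER>
        <CHARACTER type="arbitrary" Character_Name="scent">
           a marvelous perfume on a perfect spring day
        </CHARACTER>
    </DESCRIPTION>
</DOCUMENT>"""

# Hypothetical character list: character name -> allowed state names.
CHARACTER_LIST = {"Leaves": {"present", "absent"}}


def check_description(xml_text, character_list):
    """Validate defined characters; collect arbitrary ones as free text."""
    problems = []
    free_text = {}
    root = ET.fromstring(xml_text)
    for character in root.iter("CHARACTER"):
        name = character.get("Character_Name")
        if character.get("type") == "defined":
            allowed = character_list.get(name)
            if allowed is None:
                problems.append(f"unknown character: {name}")
                continue
            for state in character.findall("STATE"):
                state_name = state.get("State_Name")
                if state_name not in allowed:
                    problems.append(f"unknown state for {name}: {state_name}")
        else:
            # "Fuzzy" characters are stored as-is, with no validation.
            free_text[name] = (character.text or "").strip()
    return problems, free_text


problems, free_text = check_description(DOC, CHARACTER_LIST)
```

Only the defined characters are ever checked, so the free-text "blob"
can ride along in the same document without weakening validation of the
structured part.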
>
>Of course, as Mike would point out, such characters are really not terribly
>different from DELTA "text" characters...
>
>I suspect that how all this eventually gets used will depend substantially
>on the sorts of editing and markup tools that are developed. I don't really
>anticipate that many taxonomists will want to go through their existing
>natural language descriptions and insert <ELEMENT></ELEMENT> tags manually.
>It's not only tedious, it's too darned easy to make mistakes. And perhaps
>the editing tools can be made clever enough that maintaining a character
>list isn't all that difficult. But it's still a good idea to have a system
>flexible enough to catch material that doesn't fit the mold.
>
>So we do agree, kind of...
>
>Eric Zurcher
>CSIRO Division of Entomology
>Canberra, Australia
>E-mail: ericz at ento.csiro.au
>

Susan Farmer
-----

Susan Farmer
sfarmer at goldsword.com
Botany Department, University of Tennessee
http://www.goldsword.com/sfarmer/Trillium