Character hierarchies

Kevin Thiele kevin.thiele at BIGPOND.COM
Fri Nov 30 12:48:10 CET 2001


----- Original Message -----
From: "Jim Croft" <jrc at ANBG.GOV.AU>
To: <TDWG-SDD at USOBI.ORG>
Sent: Friday, November 30, 2001 1:03 AM
Subject: Re: Document vs. database

| Also, in our new model we also want to show more than just presence
| or absence of a state.  Don't we want to show if it is present/rarely
| present/present by misinterpretation/rarely present by
| misinterpretation, etc., and my favourites yet to be implemented as
| a character state attribute: definitely absent, absent by
| misinterpretation, unknown, unscored

Of course these will come in. In my example, see:

<Feature Name="Longevity"><Value>persistent</Value>(rarely<Value
Qualifier="rarely">deciduous</Value>)</Feature>

The other qualifiers (misinterpretations etc.) aren't in there yet, but will
come. We'll have to be much more sophisticated than in my example, of course.
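
As a minimal sketch of how a program might read such qualified values, the
Python below parses the Longevity example with the standard library and pairs
each state with its Qualifier attribute. The function name scored_states is
hypothetical, and treating a missing Qualifier as None is an assumption, not
part of any agreed model.

```python
import xml.etree.ElementTree as ET

# The example from the message, with the line break inside the tag removed.
SOURCE = (
    '<Feature Name="Longevity"><Value>persistent</Value>'
    '(rarely<Value Qualifier="rarely">deciduous</Value>)</Feature>'
)

def scored_states(feature):
    """Return (state, qualifier) pairs for the direct <Value> children.

    An unqualified state gets the qualifier None (an assumption). Free text
    between the elements, here "(rarely" and ")", is ignored; only the
    markup is read.
    """
    return [(v.text, v.get("Qualifier")) for v in feature.findall("Value")]

feature = ET.fromstring(SOURCE)
states = scored_states(feature)
print(feature.get("Name"), states)
```

Further qualifiers (misinterpretation, unknown, unscored) would presumably
arrive as more attribute values, which this reading would pass through
unchanged.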

Now how about "definitely known to be unknown"?

| > I'm simply suggesting allowing n levels:
| > <Leaves>
| >     <shape>
| >         <ovate>
| >         <elliptic>
| > <Flowers>
| >     <colour>
| >         <blue>
| >         <red>
|
| That is where we want to get to, but echoing Bob's words, what we
| are after is the structure that allows us to get to that, giving
| people all the appearance of freedom to do what they like but not
| actually doing it and imposing a schematic straitjacket on the data.
| A schema for the data structure rather than the descriptive data itself.
| Like all freedoms, the descriptive data one too must be an illusion or
| we will miss out on all the creative potential that it offers as the
| free spirits among us do their own thing.  But that is another
| discussion...  :)

Sorry, I made a blue with the example. It should have been:

<Feature name="Leaves">
    <Feature name="shape">
        <Value>ovate</Value>
        <Value>elliptic</Value>
    </Feature>
</Feature>
<Feature name="Flowers">
    <Feature name="colour">
        <Value>blue</Value>
        <Value>red</Value>
    </Feature>
</Feature>
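
To show that one recursive element type is enough to read n levels, here is
a short Python sketch that walks nested Feature elements and lists each path
with its values. The wrapping Description root and the function name walk are
assumptions added only so that the two top-level features parse as one
document.

```python
import xml.etree.ElementTree as ET

# The corrected example, wrapped in a hypothetical <Description> root so
# that the two top-level features form one well-formed document.
SOURCE = """
<Description>
  <Feature name="Leaves">
    <Feature name="shape">
      <Value>ovate</Value>
      <Value>elliptic</Value>
    </Feature>
  </Feature>
  <Feature name="Flowers">
    <Feature name="colour">
      <Value>blue</Value>
      <Value>red</Value>
    </Feature>
  </Feature>
</Description>
"""

def walk(node, path=()):
    """Yield (feature path, values) for every feature that holds values."""
    for feature in node.findall("Feature"):
        here = path + (feature.get("name"),)
        values = [v.text for v in feature.findall("Value")]
        if values:
            yield here, values
        # Recurse: a feature may contain further features, to any depth.
        yield from walk(feature, here)

rows = list(walk(ET.fromstring(SOURCE)))
for path, values in rows:
    print(" / ".join(path), "=", ", ".join(values))
```

The same function handles a third or fourth level without change, because
the recursion does not care how deep a Feature sits.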

| There has been a lot of talk about the feature/value paradigm and how
| this might be made to represent a biological description and even nested
| features in such descriptions...
|
| At the recent TDWG meeting Richard Pankhurst described something like a
| feature/character/value paradigm and at the time I made a note that this
| was probably worth considering in more detail, but so far have not had
| the time.
|
| It probably would mean something like:
|
| <feature name="leaf" character="margin" value="serrate"/> or some

But I can't see that "margin" is a thing different in kind from "leaf".
Sure, the margin is part of a leaf but equally the leaf is part of the
plant, so if you have feature|character|value then you will end up with
"leaf" as a feature in one context and a character in another - messy,
surely. My example above just has Feature|value. The difference is that a
feature is a thing but a value is not (both "leaf" and "margin" are things
that can be pointed at, "ovate" and "blue" are not). In the XML tree
representation above features are nodes and values are twigs (or leaves if
you like - of course, a "twig" would be a node, not a twig (or a leaf) - oh
dear, can someone invent a language that isn't a language!)

Ditto for Jim's second example:

|<feature name="leaf">
|  <character name="margin">
|    <observation value="serrate" modification="obscurely" state="present"
|     basis="specimen" reference="Stevens in NGF67432"/>
|    <observation ..... />
|    <observation ..... />
|  </character>
|</feature>

Try interpolating another level in here (e.g. marginal teeth shape) and you
get into strife. Whereas with

<feature name="leaf">
  <feature name="margin">
    <value value="serrate" modification="obscurely" state="present"
     basis="specimen" reference="Stevens in NGF67432"/>
    <value ..... />
    <value ..... />
  </feature>
</feature>

We can surely add as many levels as we like.

Or can we - here's a challenge - find the simplest way of expanding
description 1: "Leaf margin serrate" into description 2: "Leaf margin
serrate with forward-pointing teeth".

Attempt 1.
"Leaf margin serrate"
<feature name="leaf">
  <feature name="margin">
     <value>serrate</value>
  </feature>
</feature>

"Leaf margin serrate with forward-pointing teeth"
<feature name="leaf">
  <feature name="margin">
    <feature name = "teething shape">
        <value>serrate</value>
    </feature>
    <feature name = "teeth orientation">with
        <value>forward-pointing</value>teeth
    </feature>
  </feature>
</feature>

This requires moving <value>serrate</value> from the node <feature
name="margin"> to a new interpolated node <feature name="teething shape">,
so as to keep a rule that <value>s cannot have <feature>s as siblings. An
alternative:

Attempt 2.

"Leaf margin serrate with forward-pointing teeth"
<feature name="leaf">
  <feature name="margin">
    <value>serrate</value>
    <feature name = "teeth orientation">
        <value>forward-pointing</value>
    </feature>
  </feature>
</feature>

breaks this rule. Attempt 2 seems preferable to me in terms of efficiency,
but the rule seems somehow important.
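
The rule can at least be stated mechanically. The Python sketch below, under
the assumption that the rule means "no feature has both value and feature
children", finds the offending nodes in Attempt 2 and then applies the
Attempt 1 repair by moving the bare values into a new interpolated feature.
The function names mixed_features and interpolate are hypothetical.

```python
import xml.etree.ElementTree as ET

ATTEMPT_2 = """
<feature name="leaf">
  <feature name="margin">
    <value>serrate</value>
    <feature name="teeth orientation">
      <value>forward-pointing</value>
    </feature>
  </feature>
</feature>
"""

def mixed_features(root):
    """Return names of features with both <value> and <feature> children."""
    return [
        f.get("name")
        for f in root.iter("feature")
        if f.find("value") is not None and f.find("feature") is not None
    ]

def interpolate(feature, new_name):
    """Move the direct <value> children of `feature` into a new first child."""
    values = feature.findall("value")
    wrapper = ET.Element("feature", {"name": new_name})
    for value in values:
        feature.remove(value)
        wrapper.append(value)
    feature.insert(0, wrapper)
    return wrapper

root = ET.fromstring(ATTEMPT_2)
print(mixed_features(root))   # features that break the rule
margin = root.find("feature")
interpolate(margin, "teething shape")
print(mixed_features(root))   # empty once the value has been moved
```

The repair is automatic except for one thing: the name of the interpolated
node ("teething shape" here) has to come from a person, which may be the real
cost of keeping the rule.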

Of course, restricting ourselves to a 2-level hierarchy a la Lucid and DELTA
makes these problems go away, but only because we artificially collapse the
levels down and lose the structure:

"Leaf margin serrate with forward-pointing teeth"
<feature name="leaf margin teething">
    <value>serrate</value>
</feature>
<feature name = "leaf margin teeth orientation">
        <value>forward-pointing</value>
</feature>
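
The collapse itself is easy to automate, which suggests the nested form loses
nothing for 2-level tools. The Python sketch below flattens the Attempt 1
tree by joining ancestor names with spaces; the function name flatten and the
space-joined naming are assumptions, and the generated names follow the
nested node names, so the first comes out as "leaf margin teething shape"
rather than the hand-written "leaf margin teething".

```python
import xml.etree.ElementTree as ET

ATTEMPT_1 = """
<feature name="leaf">
  <feature name="margin">
    <feature name="teething shape">
      <value>serrate</value>
    </feature>
    <feature name="teeth orientation">
      <value>forward-pointing</value>
    </feature>
  </feature>
</feature>
"""

def flatten(feature, prefix=""):
    """Return flat <feature> elements, one per feature that holds values."""
    name = (prefix + " " + feature.get("name")).strip()
    flat = []
    values = feature.findall("value")
    if values:
        collapsed = ET.Element("feature", {"name": name})
        for value in values:
            ET.SubElement(collapsed, "value").text = value.text
        flat.append(collapsed)
    # Descend into nested features, carrying the joined name along.
    for child in feature.findall("feature"):
        flat.extend(flatten(child, name))
    return flat

flat = flatten(ET.fromstring(ATTEMPT_1))
for element in flat:
    print(ET.tostring(element, encoding="unicode"))
```

Going the other way, from flat names back to a tree, is not possible without
extra knowledge, since nothing in "leaf margin teeth orientation" says where
one level ends and the next begins.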

I'm getting lost here - can someone help?  We've +/- discussed this before,
but we need more thinking around real examples such as these.

Cheers - k



