It's How the Data will be Used that Counts

Steve
Tue Dec 4 15:14:10 CET 2001


Perfect agreement with Bob?  Maybe he's starting to rub off on me!

>For this reason, it is probably unnecessary to represent the state
>`present' at all, provided the semantics could reasonably require that
>a feature which is absent is never described. Is that a reasonable
>requirement?

I think this is probably true.  I can only come up with two potential
problems:

(1) In this case the state "present" can safely be implied, but that will
not always hold when states are left unspecified.  For example:

<character name="distribution">
   <character name="country">
      <state>Australia</state>
   </character>
</character>

A default state of "present" doesn't make much sense for the character
"distribution".  This may not be a serious problem but we should give it
some thought.
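
To make the concern concrete, here is a rough sketch of what a reader
applying the implied-"present" rule would produce for the example above.
The function name implied_states and the rule itself are only assumptions
for illustration, not part of any agreed schema:

```python
import xml.etree.ElementTree as ET

# The example from this message, unchanged.
DOC = """
<character name="distribution">
   <character name="country">
      <state>Australia</state>
   </character>
</character>
"""

def implied_states(elem, path=()):
    """Yield (character path, states) pairs for elem and its subcharacters.

    Hypothetical rule under discussion: a character that carries no
    <state> of its own is read as having the state "present".
    """
    path = path + (elem.get("name"),)
    states = [s.text for s in elem.findall("state")]
    yield path, states or ["present"]
    for child in elem.findall("character"):
        yield from implied_states(child, path)

result = list(implied_states(ET.fromstring(DOC)))
for path, states in result:
    print("/".join(path), "->", states)
```

Under that rule "distribution" itself comes out as "present", which is the
reading that doesn't make much sense.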

(2) I think the bigger problem is the flip-side: the rule implies that any
character missing from the description must be "absent".  We can't do it
this way, because a character may be missing simply because nobody recorded
it.
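
One way to picture the distinction is a lookup that keeps "not in the
description" separate from "absent".  This is only a sketch: the Status
names and the use of an explicit False marker for absence are my
assumptions, not a proposal for the actual representation:

```python
from enum import Enum

class Status(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    NOT_RECORDED = "not recorded"

def status(description, character):
    """Look up a character in a description.

    `description` is a hypothetical mapping from character name to True
    (described, so present) or False (explicitly marked absent).  A
    character that is simply missing is reported as NOT_RECORDED rather
    than being assumed absent.
    """
    if character not in description:
        return Status.NOT_RECORDED
    return Status.PRESENT if description[character] else Status.ABSENT

# Hypothetical description: teeth described, hairs explicitly absent,
# stipules never mentioned.
leaf = {"teeth": True, "hairs": False}
print(status(leaf, "teeth").value)
print(status(leaf, "hairs").value)
print(status(leaf, "stipules").value)
```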


>To me, the main thing that this kind of model implies is the need, in
>some cases, to provide a thesaurus, e.g. to provide advice that if a
>character (here `margin') has a subcharacter `teeth' then it may be
>described as `serrate'. Is that bad?  Or would a purely textual
>description which just said "Leaf margins with forward-pointing teeth"
>be deemed wrong absent the word "serrate" ?

Adding a thesaurus would add way too much work and removing "serrate" would
be too restrictive.


>I think natural language parsing understanding is harder than natural
>language production from structure. So I think there is less work to
>go from data to description than the other way around.

If true (and I think it is) then this would suggest a more DELTA-like
structure and less of a text-markup structure for the representation.
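
As a small illustration of the data-to-description direction, text can be
produced from structured character/state pairs with very little machinery.
The describe function and its input layout are hypothetical and are not
DELTA syntax; the phrase is the one quoted above:

```python
def describe(organ, characters):
    """Render one sentence from (character, state phrase) pairs.

    `characters` is a hypothetical list of pairs such as
    ("margins", "with forward-pointing teeth").
    """
    phrases = [f"{name} {state}" for name, state in characters]
    return f"{organ.capitalize()} " + ", ".join(phrases) + "."

sentence = describe("leaf", [("margins", "with forward-pointing teeth")])
print(sentence)
```

Going the other way, from that sentence back to the pairs, would need real
parsing.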

Steve



