It's How the Data will be Used that Counts

Tue Dec 4 15:14:10 CET 2001

Perfect agreement with Bob?  Maybe he's starting to rub off on me!

>For this reason, it is probably unnecessary to represent the state
>`present' at all, provided the semantics could reasonably require that
>a feature which is absent is never described. Is that a reasonable

I think this is probably true.  I can only come up with two potential

(1) In this case the state "present" is implied, but that will not always
hold when no state is specified.  For example:

<character name="distribution">
   <character name="country"/>
</character>

A default state of "present" doesn't make much sense for the character
"distribution".  This may not be a serious problem but we should give it
some thought.

(2) I think the bigger problem is the flip side: if a character isn't in the
description then it would have to be read as "absent".  We can't do it this
way.
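
To make concern (2) concrete, here is a minimal sketch of one way to keep
"not described" separate from "absent".  Everything in it is hypothetical:
the Status names, the status_of helper, and the dictionary layout are
illustrations, not part of any proposed schema.  It follows the quoted
proposal in reading a character that is mentioned without a state as
"present".

```python
from enum import Enum


class Status(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    NOT_RECORDED = "not recorded"


def status_of(description, character):
    """Return the status of one character in one description.

    `description` is a hypothetical mapping from character name to an
    explicitly recorded state ("present" or "absent"), or to None when
    the character is mentioned but no state is given.
    """
    if character not in description:
        # The character was never mentioned: report that, rather than
        # silently treating it as absent.
        return Status.NOT_RECORDED
    state = description[character]
    if state is None:
        # Mentioned with no state: read as "present" (the quoted proposal).
        return Status.PRESENT
    # Raises ValueError for any state other than "present" or "absent".
    return Status(state)


# Hypothetical example data.
leaf = {"teeth": None, "stipules": "absent"}

print(status_of(leaf, "teeth"))
print(status_of(leaf, "stipules"))
print(status_of(leaf, "hairs"))
```

With three outcomes instead of two, a missing character never has to be
interpreted as "absent".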

>To me, the main thing that this kind of model implies is the need, in
>some cases, to provide a thesaurus, e.g. to provide advice that if a
>character (here `margin') has a subcharacter `teeth' then it may be
>described as `serrate'. Is that bad?  Or would a purely textual
>description which just said "Leaf margins with forward-pointing teeth"
>be deemed wrong absent the word "serrate" ?

Adding a thesaurus would add way too much work and removing "serrate" would
be too restrictive.

>I think natural language parsing understanding is harder than natural
>language production from structure. So I think there is less work to
>go from data to description than the other way around.

If true (and I think it is), then this would suggest a more DELTA-like
structure and less of a text-markup structure for the representation.
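
As a rough illustration of the "data to description" direction, a sketch
like the following turns structured character/state data into a sentence.
The describe function and the data layout are assumptions made up for this
example; it is not DELTA's actual format or tooling.

```python
def describe(structure, characters):
    """Build a one-sentence description from structured data.

    `structure` is the name of the part being described and `characters`
    is a hypothetical mapping from character name to a list of states.
    """
    clauses = []
    for character, states in characters.items():
        # Join several states of one character with "and".
        clauses.append(f"{character} {' and '.join(states)}")
    return f"{structure} " + ", ".join(clauses) + "."


# Hypothetical example data.
print(describe("Leaf", {"margins": ["serrate"]}))
```

Going the other way, from free text back to this structure, would need a
parser that copes with all the ways the same thing can be worded.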


