DELTA cf SDD

Mike Dallwitz md at ENTO.CSIRO.AU
Wed Sep 6 09:50:52 CEST 2000


> From: Kevin Thiele <kevin.thiele at PI.CSIRO.AU>
> To: TDWG-SDD at USOBI.ORG

>> Example:
>>
>>   5,3/3&1<when infected with virus>/<occasionally>2<@reliability 3>
>>
>> = Flowers red, or red and white (when infected with virus), or
>>   occasionally yellow.

> There are various types of things here that are masquerading as being
> similar, when they're not at all. "<when infected by a virus>" is perhaps
> a true comment. Note that these comments are only used for natural
> language descriptions (they're stripped out when DELTA translates to an
> interactive key etc). "<occasionally>" is a qualifier as defined in the
> draft spec.

The things are similar in that they qualify state values or attributes. The
distinction between free-text and coded qualifiers is also made in the
proposed DELTA enhancements (see
http://biodiversity.uno.edu/delta/standard/proposal.exe). The coded
qualifiers in the DELTA proposals are distinguished by being preceded by the
symbol '@'. The above example could have been written
    5,3/3&1<when infected with virus>/<@rarely>2<@reliability 3>
The coded qualifiers could be checked for validity by programs, and could be
used by programs for various purposes (including natural-language
descriptions). Although the primary purpose of free-text qualifiers is for
natural-language descriptions, they could also be used for other purposes
(e.g. they could be searched). Their omission from the current version of
Intkey is only for historical reasons (and is also irrelevant to the present
discussion).
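
To make the distinction concrete, here is a minimal Python sketch of how a program might separate free-text comments from '@'-prefixed coded qualifiers in an attribute such as the one above. It is an illustration only, not part of the DELTA proposals: the function names and the output layout are made up for this example, '-' is assumed to be the 'to' connector, nested angle brackets are not handled, and each qualifier is simply recorded against the alternative it appears in (whether '@reliability 3' should apply to one state value or to the whole attribute is left open).

```python
import re

# Either an angle-bracket item (<...>) or a run of state-value text.
TOKEN = re.compile(r"<([^<>]*)>|([^<>]+)")


def split_alternatives(body):
    """Split on '/' (or), ignoring any '/' that falls inside <...>."""
    parts, current, depth = [], "", 0
    for ch in body:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == "/" and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return parts


def parse_attribute(text):
    """Return (character number, list of alternatives) for one attribute."""
    char_no, _, body = text.partition(",")
    alternatives = []
    for part in split_alternatives(body):
        alt = {"states": [], "connector": None, "free_text": [], "coded": []}
        for match in TOKEN.finditer(part):
            comment, value = match.group(1), match.group(2)
            if comment is not None:
                if comment.startswith("@"):
                    # Coded qualifier: a name plus an optional argument.
                    name, _, arg = comment[1:].partition(" ")
                    alt["coded"].append((name, arg or None))
                else:
                    alt["free_text"].append(comment)
            elif value.strip():
                value = value.strip()
                if "&" in value:
                    alt["connector"] = "and"
                elif "-" in value:
                    alt["connector"] = "to"  # assumed 'to' connector
                alt["states"] = [int(s) for s in re.split(r"[&-]", value)]
        alternatives.append(alt)
    return int(char_no), alternatives


example = "5,3/3&1<when infected with virus>/<@rarely>2<@reliability 3>"
char_no, alts = parse_attribute(example)
for alt in alts:
    print(char_no, alt)
```

A program that understands only some of the coded qualifiers can check the names in "coded" against its own list and skip the rest, while the free-text items stay available for descriptions or searching.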

> Lucid is currently the only program that uses this type of qualifier [like
> @rarely] in identification, and it's very important. <@reliability 3> is a
> command for a program. This should be made clear by the tag so that other
> programs can ignore it.

Of course, _any_ tag can be ignored or used by a program. I see no reason to
distinguish, in the standard, between those tags that are used by the
current version of LucID, and those that are not. Character and attribute
reliabilities (which are not implemented in the current LucID) are arguably
more important for identification than the '@rarely' qualifier.

> I've always been bothered by the use of & (and) vs / (or) in DELTA. Using
> Mike's example:
>     5,3/3&1 = "Flowers red, or red and white"
> The use of & is handy for codifying natural-language descriptions ...

'nuff said. The connectors 'and' and 'to' were put in DELTA for this
purpose, in response to user demand. Omitting them from a new standard would
be a retrograde step.

> ... but in other applications it's highly problematical. This is a
> homology problem ... The red & white broken colouring of virus-infected
> tulips is actually non-homologous with uniform red or uniform white and
> should be a separate character (or at least a separate state).

Let's not confuse empirical facts with principles. 'red and white' _could_
be a homologous, intermediate condition between 'red' and 'white', in the
same way that 'red to white' could be (cf. colour change in hydrangeas). You
have tacitly admitted this by suggesting that it could be a separate state.

In multi-purpose databases, you often have to make compromises in the way
characters are defined, so that they will work reasonably well for all
purposes. The possibilities in this example might be

#1. flowers <colour>/
    1. red/
    2. white/

#2. flowers <colour>/
    1. entirely deep red/
    2. red and white/
    3. entirely white/

#3. entirely red flowers <presence>/
    1. present/
    2. absent/

#4. red and white flowers <presence>/
    1. present/
    2. absent/

#5. entirely white flowers <presence>/
    1. present/
    2. absent/

Character 1 is simple, produces the best natural-language descriptions, and
is often acceptable for other applications. However, one of the other
formulations would probably have to be used (depending on the capabilities
of the software) if the distinction between 'red and white' and the other
conditions was important for identification or classification. The
natural-language descriptions would suffer, particularly if characters 3-5
were used.
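
As a rough illustration of the trade-off, the following Python sketch recodes one observation ("flowers red, or red and white", i.e. 1,1/1&2 under character 1) into the other two formulations listed above. The function name and the set-based input are invented for this example; the state numbering follows the character list given in this message.

```python
def recode(alternatives):
    """Recode character-1 alternatives into characters 2 and 3-5.

    Each alternative is a frozenset of character-1 states
    (1 = red, 2 = white); a set holding both states stands for
    the two states joined by 'and'.
    """
    # Character 2: 1 = entirely red, 2 = red and white, 3 = entirely white.
    to_char2 = {
        frozenset({1}): 1,
        frozenset({1, 2}): 2,
        frozenset({2}): 3,
    }
    char2_states = sorted(to_char2[alt] for alt in alternatives)
    char2 = "2," + "/".join(str(state) for state in char2_states)

    # Characters 3-5: one presence/absence character per condition
    # (state 1 = present, state 2 = absent).
    presence = " ".join(
        f"{char_no},{1 if state in char2_states else 2}"
        for char_no, state in ((3, 1), (4, 2), (5, 3))
    )
    return char2, presence


char2, presence = recode([frozenset({1}), frozenset({1, 2})])
print(char2)     # 2,1/2
print(presence)  # 3,1 4,1 5,2
```

The single attribute of character 1 becomes three separate attributes under characters 3-5, which is why the natural-language descriptions suffer most with that formulation.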

It's impossible to cater for the connector 'to' by redefining characters in
this way.

--

Mike Dallwitz

CSIRO Entomology, GPO Box 1700, Canberra ACT 2601, Australia
Phone: +61 2 6246 4075   Fax: +61 2 6246 4000
Email: md at ento.csiro.au  Internet: biodiversity.uno.edu/delta/



