(GEN) Schema independence

Gregor Hagedorn G.Hagedorn at BBA.DE
Mon Nov 29 18:09:49 CET 1999


> >    * I propose to have 3 XML Namespaces for our different XML
> >      vocabularies:
> >         o biological descriptions (generalities)
> >         o botany
> >         o zoology
>
> I'm sure you don't mean to leave out Mycology, Bryology, etc.

Exactly (says a mycologist). Also, what about at least 20 different
namespaces for protists, 50 or so for bacteria? Viruses, mycoplasmas,
viroids, ...

Biological diversity is well beyond our current grasp. We should
admit that we do NOT understand it sufficiently in any way. Even what
we do understand is obscured by terminological inadequacies or by
concurrent, but not entirely synonymous, terminology. From my own
experience, I think that unifying character definitions even among
fairly closely related taxa within a small group of the fungi can be
a challenging task.

Richard Pankhurst could probably step in here and explain why it is
so difficult even to draw up a core consensus character set for all
higher plants (an effort he has been involved in for several years
now).

The tendency in this discussion to assume that things like leaf,
petiole, veins etc. should be directly coded into the standard
worries me. The examples are fine for the purpose of discussion, but
I hope that most people agree that the schema independence of DELTA
(and, in principle, of NEXUS as well, although NEXUS leaves most of
the definition to free text in a published article) is a big
achievement, and I would not like to sacrifice it.

This does not mean that we should not try to unify character schemas,
but it definitely should be a separate task. We need a general
standard, and an applied standard that actually uses a defined
namespace.

So, I believe we need:

o Character schema definition language, expressed in XML-DTD or
  better XML-Schema
   - Researchers working on weird stuff like Tardigrada should not be
     required to wait until somebody creates a schema for them, but
     must be able to go ahead and develop their own schema.
   - DELTA is, in part, such a language.

o Character schema (an applied character definition that actually
  defines names, types, constraints, etc. for a set of characters),
  expressed in the new Character schema definition language.
   - The character definition of a DELTA data set is such a schema.
   - The thing to achieve would be to express this in such a way
     that it becomes a usable XML Namespace definition, while still
     linking into the Character schema definition language! It could
     then be used directly by XML tools, but it would gain added
     value from the more relevant additional definitions in the
     Character schema definition language!
o An application to express item descriptions using these tools.
  - Example: DELTA ITEM DESCRIPTION.
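
To make the three layers above concrete, here is a minimal sketch in
Python rather than XML. It is only an illustration of the separation,
not a proposal for the standard: the class names, the namespace URI,
and the Tardigrada characters are all hypothetical. The generic
classes play the role of the definition language, the `tardigrada`
object is one applied character schema, and the dictionaries are item
descriptions checked against it.

```python
from dataclasses import dataclass, field


# Layer 1: a generic "character schema definition language".
# It knows nothing about leaves, claws, or any particular taxon.
@dataclass(frozen=True)
class Character:
    name: str
    kind: str           # "categorical" or "numeric"
    states: tuple = ()  # allowed states, for categorical characters
    unit: str = ""      # measurement unit, for numeric characters


@dataclass
class CharacterSchema:
    namespace: str  # hypothetical identifier for the applied schema
    characters: dict = field(default_factory=dict)

    def define(self, character):
        self.characters[character.name] = character

    def validate(self, item):
        """Return a list of problems found in an item description."""
        errors = []
        for name, value in item.items():
            ch = self.characters.get(name)
            if ch is None:
                errors.append(f"unknown character: {name}")
            elif ch.kind == "categorical" and value not in ch.states:
                errors.append(f"{name}: invalid state {value!r}")
            elif ch.kind == "numeric" and not isinstance(value, (int, float)):
                errors.append(f"{name}: expected a number, got {value!r}")
        return errors


# Layer 2: an applied character schema, written by the specialist.
# The characters and states below are invented for illustration.
tardigrada = CharacterSchema(namespace="urn:example:tardigrada")
tardigrada.define(Character("claw type", "categorical",
                            states=("simple", "branched")))
tardigrada.define(Character("body length", "numeric", unit="um"))

# Layer 3: item descriptions that use the applied schema.
item = {"claw type": "branched", "body length": 250}
bad_item = {"claw type": "hooked", "leaf shape": "ovate"}

print(tardigrada.validate(item))
print(tardigrada.validate(bad_item))
```

Note that nothing in the first layer had to change in order to
describe a new group; only the second layer is specific to the taxon.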

Thus I envision something like:

+ Character definition language,
    adding special metainformation to:
+ Standard XML Namespace definition,
    which, however, can be used independently

Both together form a character schema of a new standard, which would
be used in item descriptions. The item descriptions themselves have
two components:

+ A markup language to mark existing (or computer generated) free
  textual descriptions
+ A data language, to exactly define the data used, including
  knowledge management support

See also the separate post "Markup of text descriptions Vs.
structured data (GEN)"

Gregor
----------------------------------------------------------
Inst. for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Gregor Hagedorn                 Net: G.Hagedorn at bba.de
Koenigin-Luise-Str. 19          Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203

Often wrong but never in doubt!



