(LEX) Global but non-universal lexicons

Gregor Hagedorn G.Hagedorn at BBA.DE
Mon Jan 17 17:05:15 CET 2000


> There are a number of issues:
> - how are character lists imported if a computer is not connected to
> the net?

It was already said that we need a local copy. I think this is quite
natural, especially because I believe most applications will work on
proprietary database formats and not directly on XML.

> - how are multiple imported character lists merged - which takes
> precedence (local or remote)

We need to be able to combine both character definitions and item
descriptions in such a way that some characters may span multiple item
subsets while other characters are used only in certain items. If items
are then reduced to the level of taxa in a summarize/synthesize
operation, even more complex structures can result.

> - versioning of character lists

Of course, with major versions in which meaning is changed, minor
versions in which only corrections and additions are made, and perhaps
subminor versions with corrections only.
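
As a rough illustration of these three levels, here is a minimal Python
sketch. The names `StandardVersion` and `meaning_preserved` are
hypothetical, and the rule that a newer minor or subminor version keeps
the meaning of older data is an assumption drawn from the description
above, not part of any agreed standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class StandardVersion:
    """Hypothetical three-level version of a character definition standard."""
    major: int     # bumped when the meaning of characters or states changes
    minor: int     # bumped for corrections and additions
    subminor: int  # bumped for corrections only

    @classmethod
    def parse(cls, text: str) -> "StandardVersion":
        """Parse a string of the form 'major.minor.subminor'."""
        major, minor, subminor = (int(part) for part in text.split("."))
        return cls(major, minor, subminor)


def meaning_preserved(old: StandardVersion, new: StandardVersion) -> bool:
    """Assumed rule: data coded against `old` keeps its meaning under `new`
    if the major version is unchanged and `new` is not older than `old`."""
    return old.major == new.major and new >= old
```

Under this assumed rule a local copy at 1.2.0 could be updated to 1.3.1
automatically, while an update to 2.0.0 would need manual review.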

> - what if a remote character list is (or becomes) invalid (related
> to versioning)

Globally unique state IDs may consist of a local ID (not the sequence
number) plus the name and version of the standard.
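
One possible shape for such a composite ID, sketched in Python. The
class name, field names, separator, and example values are all
hypothetical and only meant to show the three components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalStateId:
    """Hypothetical globally unique state ID built from three components."""
    standard_name: str     # name of the published standard
    standard_version: str  # version of that standard
    local_id: str          # stable ID within the standard, not an ordering number

    def __str__(self) -> str:
        return f"{self.standard_name}/{self.standard_version}/{self.local_id}"


# The same local ID under two versions of a standard yields two distinct
# global IDs, so a state whose standard has changed can still be told apart.
a = GlobalStateId("ExampleStandard", "1.0", "spore-wall-smooth")
b = GlobalStateId("ExampleStandard", "2.0", "spore-wall-smooth")
```

Because the sequence number is not part of the ID, states can be
reordered locally without changing their identity.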

I believe the business of merging, updating, and controlling versioned
character definition standards is one of the main reasons why we need
more structure in the character definition (see basic property
discussion etc.).

---

I believe we need:

Globally unique IDs for _states_ (perhaps for characters and headings
as well, not sure). The ID should be independent of ID numbers like
those in DELTA and DeltaAccess, which are simultaneously ordering
numbers.

A local copy that refers to a published standard. I believe it would
be good if the future standard allows manual links e.g. to a standard
published in print.

I imagine an attribute in which the author of a character definition
asserts that in his or her opinion the meaning of this data item is
identical to a standard, even if the wording may have been slightly
changed.
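
A minimal Python sketch of how a local copy might record its link to a
standard together with that assertion. All names here (`StandardLink`,
`is_manual_link`, the fields, and the placeholder values) are
hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StandardLink:
    """Hypothetical link from a local character or state to a standard."""
    standard_name: str
    standard_version: str
    item_id: str                      # ID of the character or state in the standard
    url: Optional[str] = None         # None when the standard exists only in print
    citation: Optional[str] = None    # manual reference to a printed standard
    asserted_identical: bool = False  # author's opinion: meaning is unchanged
    asserted_by: Optional[str] = None


def is_manual_link(link: StandardLink) -> bool:
    """A link without a URL has to be resolved manually by a human reader."""
    return link.url is None


# Placeholder values only; no real standard or publication is implied.
print_link = StandardLink(
    standard_name="ExampleStandard",
    standard_version="1.0",
    item_id="spore-wall-smooth",
    citation="printed edition, page number placeholder",
    asserted_identical=True,
    asserted_by="local author",
)
```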

The problem is that ANY character definition standard needs to be
partly changed locally. Certain attributes of the character
definition, especially those that are language specific (character and
state wordings, including character and state specific number and
date formats) or that refer to formatting and structuring
(ParagraphLink, SentenceLink, CommaLink for natural language,
sequential ordering numbers, Heading, etc.), are highly likely to be
in need of change.

It may be desirable to split the character definition into a logical
part, changes of which imply changes of contents, and a formatting
part, which is always used as template only, and can be freely
changed.

(This implies that the character name needs to be doubled/split into
a global development name, probably in English, and a name that is
displayed in local language on the user interface.)
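
The split could look roughly like the following Python sketch. The
class and field names are hypothetical; only the division into a
logical part and a freely changeable formatting part, and the doubled
name, come from the text above.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class LogicalPart:
    """Changes here imply a change of content."""
    global_name: str            # development name, probably in English
    state_ids: Tuple[str, ...]  # stable state identifiers


@dataclass(frozen=True)
class FormattingPart:
    """Used as a template only; may be changed freely."""
    display_name: str                # name shown in the local language
    state_wordings: Tuple[str, ...]  # local wording of each state
    sequence_number: int             # ordering number, not an identifier


@dataclass(frozen=True)
class CharacterDefinition:
    logical: LogicalPart
    formatting: FormattingPart

    def same_meaning_as(self, other: "CharacterDefinition") -> bool:
        """Only the logical part decides whether the meaning is the same."""
        return self.logical == other.logical


standard = CharacterDefinition(
    LogicalPart("leaf_shape", ("round", "lanceolate")),
    FormattingPart("Leaf shape", ("round", "lanceolate"), 1),
)
# A local German copy changes wording and ordering but not the meaning.
local = replace(
    standard,
    formatting=FormattingPart("Blattform", ("rund", "lanzettlich"), 12),
)
```

Here `local.same_meaning_as(standard)` is true even though the two
definitions differ in every formatting attribute.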

Gregor

----------------------------------------------------------
Gregor Hagedorn                 G.Hagedorn at bba.de
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19          Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203

Often wrong but never in doubt!



