Special states

Wed Sep 3 12:55:06 CEST 2003

Kevin wrote:
> It strikes me that
> 1. in some ways these two suggestions are very similar (whether the
> thing goes into the taxon x character cell or is attached to it is
> immaterial from a data point of view). 2. the real difference is in
> where their values are defined. Gregor's "special states" are defined
> in the instance document, while Steve's flags are (I assume) defined
> in the schema.
> Does this make a difference?

Two differences I see:

a) The Brazil proposal allows the project designer to constrain which
"missing data indicators = special states = CodingStatus" are
available within a given character while scoring this character in an
item description.

This is why the "special states" do indeed occur within a character
definition: they are not defined there; rather, by referring to the
global/shared definition level, they are _enabled_.

What I find nice is that this uses the same kind of mechanism as the
inheritance of shared state sets: either inherit the full set
for a given character, or restrict it to certain states.
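
A minimal sketch of this enabling mechanism, in Python rather than
XML. All names and the five state ids are hypothetical and do not come
from the SDD schema; the point is only that a character refers to
shared definitions and either inherits the full set or restricts it.

```python
# Illustrative shared (global) definitions; the ids are invented.
SHARED_SPECIAL_STATES = frozenset(
    {"unknown", "not_applicable", "not_scored", "uncertain", "variable"}
)


def enabled_special_states(restrict_to=None):
    """Return the special states that a character enables.

    None means: inherit the full shared set. Otherwise only the listed
    states are enabled, and each must already exist at the shared
    level, because a character refers to definitions and never
    creates them.
    """
    if restrict_to is None:
        return SHARED_SPECIAL_STATES
    requested = frozenset(restrict_to)
    undefined = requested - SHARED_SPECIAL_STATES
    if undefined:
        raise ValueError(
            f"not defined at the shared level: {sorted(undefined)}"
        )
    return requested
```

Calling enabled_special_states() corresponds to inheriting the full
set; enabled_special_states(["unknown", "not_applicable"]) corresponds
to restricting it for one character.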

BUT: it is not necessary to do it like this. This is truly just my
private attempt to get some structural generalization into the model.

Also: I tried to raise the question of how important it is to be able
to enable/disable certain special states. I find it useful if the
list of special states has, say, 5 options and you want to produce a
web interface, but I am uncertain whether it is worth it. If it is a
side-effect of having a structurally simple model, I welcome it, but
otherwise I can just as well live with "missing data indicators =
special states = CodingStatus" always enabled for all characters if
we can reach a consensus about that.

b) Second difference: special states do need a label/wording whenever
they are reported. To achieve multilinguality this wording must be
user-definable (i.e. by the designer of the project/the terminology).
To me this points in the direction of putting it into the XML
instance document, handling it like the state definitions. Again,
other solutions are possible.
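
To illustrate what "user-definable wording" could look like, here is a
small Python sketch. The state ids, the language codes and the
fallback rule are assumptions made for the example, not part of any
proposal.

```python
# Labels as a project designer might supply them in the instance
# document: one wording per special state and language (all invented).
SPECIAL_STATE_LABELS = {
    "unknown": {"en": "unknown", "de": "unbekannt"},
    "not_applicable": {"en": "not applicable", "de": "nicht anwendbar"},
}


def label_for(state_id, lang, fallback="en"):
    """Return the wording used to report a special state in `lang`.

    If the designer has not provided that language, the fallback
    language is used instead (an assumption of this sketch).
    """
    labels = SPECIAL_STATE_LABELS[state_id]
    return labels.get(lang, labels[fallback])
```

A report generator would then call label_for("unknown", "de") instead
of printing a wording that is fixed in the schema.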

I am not fully happy with the way the Brazil model works, and it
obviously confuses people.

Perhaps one of the confusing things is that

1. SDD makes definitions
2. Project designer makes global/shared definitions
3. Project designer creates characters,
   referring to shared definitions (point 2 above),
   inheriting the properties of 0-several shared definitions, PLUS
   in the case of categorical states defines local states
   (but in the Brazil model not in the case of "statistical parameters
   = global measures" or "special states")
4. The scientist using a designed project scores the
   descriptions based on the decisions of the designer
   of the terminology.

Because SDD is able to constrain things through the schema, this
4-layer model is actually NOT represented in the Brazil model (and the
model could perhaps be made more understandable by changing that? I
really don't know!). Instead, SDD definitions like "Special states"
(or whatever better to call them) and "statistical parameters" are
treated like point 2, but the designer has only limited choice. The
xml schema requires the designer to have a given number of shared
definitions for special states and statistical parameters present in
that part.

However, this contraction allows the designer to add labels to these
required definitions in new languages. I currently do not see a better
solution for this, but I welcome any suggestion!
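
A rough Python sketch of the constraint described above: the set of
special states is fixed (here by a constant standing in for the XML
schema), while the designer may only attach labels in additional
languages. The ids and the exact rules checked are assumptions made
for the illustration.

```python
# Stand-in for what the schema would fix; the ids are invented.
REQUIRED_SPECIAL_STATES = ("unknown", "not_applicable")


def check_shared_definitions(definitions):
    """Return a list of problems; an empty list means no problem found.

    `definitions` maps a state id to its {language: label} dictionary,
    as supplied by the project designer.
    """
    problems = []
    for state_id in REQUIRED_SPECIAL_STATES:
        if state_id not in definitions:
            problems.append(f"missing required definition: {state_id}")
        elif not definitions[state_id]:
            problems.append(f"no label given for: {state_id}")
    for state_id in definitions:
        if state_id not in REQUIRED_SPECIAL_STATES:
            problems.append(f"not permitted by the schema: {state_id}")
    return problems
```

Adding a Portuguese label to "unknown" passes this check, whereas
leaving out "not_applicable" or inventing a new special state does not.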

Can anybody help me in making the explanations above or the questions
posed in "stepping back" more intelligible?

Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn at bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19          Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203

Often wrong but never in doubt!