Kevin Thiele wrote:
An item in the DDS is like an old OTU - it may be anything from a specimen
to a species to a Kingdom.
...
The possibility of specimens being items comes from the ability to nest treatments.
For example, Treatment A comprises measurements of a series of specimens of Imaginaria magica for some characters (e.g. hypopode length). Treatment B has species of Imaginaria as its items. Some characters (e.g. CAM metabolism) are scored directly in B, while others (e.g. the range of hypopode lengths) are collated from A.
Data for the character hypopode length in B would then be attributed to Treatment A, where each data element is attributed to measurements taken by me on a specimen. Redetermination of a specimen, followed by rebuilding of the treatment tree, would allow the redetermination to be incorporated all the way up.
Would this work, and does it address your query?
This seems plausible to me.
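To make the nesting concrete, here is a minimal sketch in Python of how collation and attribution might work. Everything in it (the Treatment and Item classes, score, collate_range) is a hypothetical illustration of the idea above, not a proposed schema:

from dataclasses import dataclass, field

@dataclass
class Item:
    # An item in the sense above: a specimen, a species, or a higher taxon.
    identifier: str     # e.g. a specimen number or a taxon name
    determination: str  # the taxon name currently applied to the item

@dataclass
class Treatment:
    name: str
    items: list = field(default_factory=list)
    scores: dict = field(default_factory=dict)  # (item id, character) -> record

    def score(self, item, character, value, source=None):
        # Record a data element; `source` names the treatment it was collated from.
        self.scores[(item.identifier, character)] = {"value": value, "source": source}

def collate_range(source, target, character):
    # Collate specimen-level measurements in `source` into a per-taxon
    # range in `target`, attributing each collated element to `source`.
    by_taxon = {}
    for item in source.items:
        rec = source.scores.get((item.identifier, character))
        if rec is not None:
            by_taxon.setdefault(item.determination, []).append(rec["value"])
    for taxon in target.items:
        values = by_taxon.get(taxon.determination)
        if values:
            target.score(taxon, character, (min(values), max(values)),
                         source=source.name)

# Treatment A: specimen-level measurements of hypopode length (mm).
a = Treatment("A")
a.items = [Item("spec-001", "Imaginaria magica"),
           Item("spec-002", "Imaginaria magica")]
a.score(a.items[0], "hypopode length", 4.2)
a.score(a.items[1], "hypopode length", 5.1)

# Treatment B: species as items; hypopode length collated from A.
b = Treatment("B")
b.items = [Item("Imaginaria magica", "Imaginaria magica")]
collate_range(a, b, "hypopode length")
# B now holds (4.2, 5.1) attributed to Treatment A. Redetermining spec-002
# and re-running collate_range carries the change all the way up.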
The only complication that comes to mind is the situation where one might have two inconsistent identifications of particular specimens. How would the redetermination rule work with respect to A if one has a Treatment C, say for hypopode width, in which the identification of some of the same specimens is inconsistent with that used in Treatment A?
If the inconsistency is based solely on a difference of combination (perhaps an old synonym incorrectly used), one could make the redetermination automatically, using the nomenclaturally appropriate combination. Otherwise, I'm not sure what flags get raised, since it would depend on the nature of the inconsistency. Perhaps the appropriate response is simply to throw the data back to the user with an error message and let them make a decision? This could certainly aid in identifying mistaken identifications, but in any event there probably needs to be some kind of default logic once an inconsistency in the data is detected. It will be difficult to eliminate inconsistency entirely.
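As a sketch of what that default logic might look like, reusing the hypothetical Treatment and Item classes from the sketch above (the SYNONYMY table is an invented stand-in for a real nomenclator lookup, which this does not attempt to implement):

# SYNONYMY stands in for a nomenclator: old combination -> accepted combination.
SYNONYMY = {
    "Imaginarium magicum": "Imaginaria magica",  # assumed example mapping
}

def accepted_name(name):
    # Resolve a combination to its accepted name (identity if already accepted).
    return SYNONYMY.get(name, name)

def check_determinations(a, c):
    # Compare determinations of specimens shared by treatments a and c.
    # Purely nomenclatural differences are redetermined automatically;
    # anything else is returned as a conflict for the user to resolve.
    conflicts = []
    items_a = {item.identifier: item for item in a.items}
    for item_c in c.items:
        item_a = items_a.get(item_c.identifier)
        if item_a is None or item_a.determination == item_c.determination:
            continue  # specimen not shared, or determinations agree
        if accepted_name(item_a.determination) == accepted_name(item_c.determination):
            # Same taxon under different combinations: apply the
            # nomenclaturally appropriate combination to both.
            item_a.determination = accepted_name(item_a.determination)
            item_c.determination = item_a.determination
        else:
            # Genuinely inconsistent identification: flag it and let the
            # user decide, rather than guessing a default.
            conflicts.append((item_c.identifier,
                              item_a.determination, item_c.determination))
    return conflicts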
Statement:
One file will comprise one treatment, the basic unit of which is one or more characters describing one or more taxa or individuals.
Surely the data level to which a character is attributed is determined by the context of the treatment:
I would agree. I think it's a question of the extent to which we want the system to make (part of) the inference for us, so that we can be aware of the "universe" of features out there for which there are relevant data. When the data are initially recorded, they are most likely defined within a much narrower context. I'm still struggling with the extent to which rules can be constructed to identify or establish that context.
Another collation rule may convert low-order numeric data into a higher-order multistate:
Could be lots of options here.
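As one illustration among those options, a simple binning rule could map specimen-level numeric measurements onto a higher-order multistate character. The function name, thresholds, and state names below are invented for the example:

def numeric_to_multistate(values, thresholds, states):
    # Map numeric values to multistate categories. `thresholds` are
    # ascending cut points, one fewer than `states`. Returns the set of
    # states observed, since a taxon's specimens may span several.
    assert len(states) == len(thresholds) + 1
    observed = set()
    for v in values:
        i = sum(v > t for t in thresholds)
        observed.add(states[i])
    return observed

# e.g. hypopode lengths (mm) binned into a three-state character:
numeric_to_multistate([4.2, 5.1, 9.8], thresholds=[5.0, 8.0],
                      states=["short", "medium", "long"])
# -> {"short", "medium", "long"}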