Leigh Dodds writes:
I too would argue that what we seek is a system that distinguishes between "observations", measured in some sense (either directly using some device, or perhaps "mapped" using known or widely used conventions)
and
higher-level abstractions that often pass as qualitative "data" (i.e. leaf "ovate" or state "relatively derived" or "1").
Would it be too simplistic to simply allow the format to express the source (qualitative/quantitative, or a more detailed description) of a particular data set?
Too simplistic, I think. There will surely be many cases where, in one data set, some characters will be "real" (i.e. low-level) observations while others will be abstractions. For instance, sometimes the taxonomist will consider it worthwhile to capture real data (leaf shapes are fairly easy to do); other times anything other than abstraction is practically impossible (try working out how to capture indumentum characteristics quantitatively).
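To make the point concrete, here is a minimal sketch of what recording the source per character, rather than once per data set, might look like. Everything in it (the `Source` categories, the `CharacterRecord` type, the sample values) is hypothetical and invented for illustration; it is not taken from any existing format.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Hypothetical labels for how a character value was obtained."""
    OBSERVED = "observed"      # low-level: measured or mapped
    ABSTRACTED = "abstracted"  # higher-level qualitative judgement


@dataclass
class CharacterRecord:
    name: str
    value: object
    source: Source  # attached to each character, not to the data set


# One illustrative data set that mixes both kinds of character.
dataset = [
    CharacterRecord("leaf length (mm)", 42.5, Source.OBSERVED),
    CharacterRecord("leaf shape", "ovate", Source.ABSTRACTED),
    CharacterRecord("indumentum", "densely hairy", Source.ABSTRACTED),
]

# A single data-set-level flag could not describe this collection.
sources_present = {record.source for record in dataset}
mixed = len(sources_present) > 1
```

With a single flag for the whole data set, the measured leaf length and the abstracted indumentum state would have to share one label.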
More on the entity/property/value model. Don wrote:
Can character definitions be constrained without making too tight a straitjacket for oneself? Might it be possible to represent taxon character descriptors by, for example, an entity/property/value (e.g. leaf/shape/ovate) based schema? This on the face of it would seem to map onto an XML element/attribute/value schema pretty well. Would that help define more closely how we construct characters and maybe even prove universally applicable for all character types?
Could one constrain further by expressing within the schema the hierarchical relationships between the elements (e.g. 'blade' and 'petiole' as child elements of 'leaf'), or would the introduction of terminology into the 'standard' be a step too far?
This worries me. Is the entity/property/value schema restricted to 3 levels? (Ditto for XML's element/attribute/value.) If so, it seems too much of a straitjacket, especially if we want to express a hierarchy of elements like this. What should we do about leaf/lamina/abaxial_surface/vein_islands/indumentum/density? I agree that a hierarchically structured schema may be wonderful, but in a biological context I wouldn't like to see it constrained to 3 levels just because that's what existing computer formats do.
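For what it's worth, here is a rough sketch of how the deeper path might be written if XML elements are allowed to nest, using Python's standard `xml.etree.ElementTree`. The helper `build_path`, the choice to treat the last path segment as an attribute, and the value "sparse" are all my own assumptions for illustration, not part of any proposed standard.

```python
import xml.etree.ElementTree as ET


def build_path(path, value):
    """Nest one element per entity in the path; the final segment
    becomes an attribute (the 'property') on the innermost element."""
    *entities, prop = path.split("/")
    root = ET.Element(entities[0])
    node = root
    for name in entities[1:]:
        node = ET.SubElement(node, name)
    node.set(prop, value)
    return root


def depth(element):
    """Number of nested element levels, counting the root as 1."""
    return 1 + max((depth(child) for child in element), default=0)


# The three-part case from Don's example.
shallow = build_path("leaf/shape", "ovate")

# The longer path: five nested entities plus one property.
deep = build_path(
    "leaf/lamina/abaxial_surface/vein_islands/indumentum/density",
    "sparse",  # hypothetical value
)

deep_xml = ET.tostring(deep, encoding="unicode")
```

In this sketch the entity part of the triple is simply a chain of nested elements of whatever length the anatomy requires, so the property and value attach at the innermost level rather than at a fixed third position.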
Cheers - k