Types of data

Thu Nov 25 14:10:21 CET 1999

Aren't observations made by looking at specimens, and measurements taken from
'organs' (or whatever the constituent parts of a specimen are called)?

The taxonomic philosophers might well argue that all characters are
abstractions, that the concept that a leaf has shape is an abstraction. In
terms of the 'reality' of a statement you might make about the shape of a
leaf, it doesn't really matter whether you describe shape using standard
terms, as a set of coordinates or as elliptic Fourier coefficients. Surely
the way you measure an attribute depends on what you are going to use the
measurements for; ultimately that's a question of precision. If there is a
possibility that there will be a variety of uses (reuse) for the data, then
an indication of the degree of precision of the measurement would be highly
desirable.

Isn't it also important to distinguish whether the description refers to an
actual leaf (i.e. that leaf there, on that specimen, is ovate) or whether you
are making a generalisation about a taxon (i.e. leaves ovate to obovate)? A
taxon cannot have a leaf shape, only leaves can, and an actual leaf does not
usually vary in shape except through time.

So if the reality is:

"I looked at two specimens; from specimen x I measured 3 leaves and their
shapes were ovate, ovate and obovate; from specimen y I measured 2 leaves,
the first was ovate, the second obovate"

Then why not say so?

<my_taxon_description>
   <specimen>
      <leaf shape="ovate"/>
      <leaf shape="ovate"/>
      <leaf shape="obovate"/>
   </specimen>
   <specimen>
      <leaf shape="ovate"/>
      <leaf shape="obovate"/>
   </specimen>
</my_taxon_description>

For a generalised statement about my taxon, all I need is an aggregate
function for the listed values of leaf shape.
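
To make "aggregate function" concrete, here is a minimal sketch in Python,
assuming the element and attribute names used in the example above. The
function names (summarise_shapes, generalise) are hypothetical, and ordering
the shapes by frequency is only one possible convention.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The per-leaf observations from the example above.
DESCRIPTION = """
<my_taxon_description>
   <specimen>
      <leaf shape="ovate"/>
      <leaf shape="ovate"/>
      <leaf shape="obovate"/>
   </specimen>
   <specimen>
      <leaf shape="ovate"/>
      <leaf shape="obovate"/>
   </specimen>
</my_taxon_description>
"""


def summarise_shapes(xml_text):
    """Count how often each leaf shape was observed, across all specimens."""
    root = ET.fromstring(xml_text)
    return Counter(leaf.get("shape") for leaf in root.iter("leaf"))


def generalise(counts):
    """Turn the counts into a 'leaves X to Y' style statement.

    Assumption: shapes are listed most frequent first, ties broken by name.
    """
    shapes = sorted(counts, key=lambda shape: (-counts[shape], shape))
    return "leaves " + " to ".join(shapes)


if __name__ == "__main__":
    counts = summarise_shapes(DESCRIPTION)
    print(dict(counts))
    print(generalise(counts))
```

The generalisation is derived from the observations, so the counts behind it
are still there if somebody wants them later.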

It seems that very few 'real' observations are formally recorded but we have
a lot of generalisations.

"leaves ovate to obovate, 5--31 mm long"

What can be deduced about the relationship between leaf shape and leaf
length?
At what point in the exercise of recording observations should the decision
be made that this information either is or is not important?

<my_taxon_description>
   <specimen>
      <leaf shape="ovate" length_mm="10"/>
      <leaf shape="ovate" length_mm="8"/>
      <leaf shape="obovate" length_mm="25"/>
   </specimen>
   <specimen>
      <leaf shape="ovate" length_mm="5"/>
      <leaf shape="obovate" length_mm="31"/>
   </specimen>
</my_taxon_description>
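
With per-leaf records like these, the shape/length question can at least be
asked. A rough sketch in Python, again assuming the attribute names from the
example above (length_range_by_shape is a hypothetical name):

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The per-leaf observations from the example above.
RECORDS = """
<my_taxon_description>
   <specimen>
      <leaf shape="ovate" length_mm="10"/>
      <leaf shape="ovate" length_mm="8"/>
      <leaf shape="obovate" length_mm="25"/>
   </specimen>
   <specimen>
      <leaf shape="ovate" length_mm="5"/>
      <leaf shape="obovate" length_mm="31"/>
   </specimen>
</my_taxon_description>
"""


def length_range_by_shape(xml_text):
    """Return {shape: (min_length_mm, max_length_mm)} over all recorded leaves."""
    root = ET.fromstring(xml_text)
    lengths = defaultdict(list)
    for leaf in root.iter("leaf"):
        lengths[leaf.get("shape")].append(float(leaf.get("length_mm")))
    return {shape: (min(vals), max(vals)) for shape, vals in lengths.items()}


if __name__ == "__main__":
    for shape, (shortest, longest) in length_range_by_shape(RECORDS).items():
        print(f"{shape}: {shortest:g}--{longest:g} mm")
```

For these five leaves the ovate ones span 5--10 mm and the obovate ones
25--31 mm, which the summary "leaves ovate to obovate, 5--31 mm long" cannot
tell you.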

Maybe the ovate leaves are all immature, or there's some environmental
factor. Maybe I don't want to know, but maybe I do.
don

> Leigh Dodds writes:
>
> >> I too would argue that what we seek is a system that distinguishes
> >> between "observations", measured in some sense (either directly using
> >> some device), or perhaps "mapped" using known or widely used
> >> conventions and higher level abstractions that often pass as
> >> qualitative "data", (ie leaf "ovate" or state "relatively derived" or
> >> "1").
> >
> >Would it be too simplistic to simply allow the format to express
> >the source (qualitative/quantitative, or a more detailed description)
> >or a particular data set?

Kevin Thiele wrote:

> Too simplistic, I think. There will surely be many cases where in one data
> set some characters will be "real" (i.e. low-level) observations while
> others will be abstractions. For instance, sometimes it will be viewed as
> worthwhile for the taxonomist to capture real data (leaf shapes are fairly
> easy to do), other times anything other than abstraction is practically
> impossible (try working out how to capture indumentum characteristics
> quantitatively).