characters/states

Sat Jul 29 23:44:41 CEST 2000

At 10:23 AM 25/7/00 -0500, Peter Stevens & Stinger commented on the hoary
issue of characters/states vs measurements:

It's completely true that a set of measurements is a more fundamental data
element than a character state. But we need to allow for both ways of
recording data, for two reasons. Firstly, some characters don't lend
themselves to measurement (e.g. indumentum - it would be possible but
hellishly difficult). Secondly, if I'm writing a key to Families, I'm
afraid I probably won't be recording measurements of individual specimens
very often, as the data blowout would be alarming. And what specimens would
I use? I see no way around the problem (for higher-order taxa at least) of
doing the collation of data (measurements->states) in my head, with all the
admitted problems that ensue.

The draft standard document I've put to the group allows for some
characters to be scored after a head-wise (hopefully wise, anyway)
collation, others to be collated from individual measurements using some
form of collation rule.
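
To make the second route concrete, here is a minimal sketch of what a collation rule might look like. It is not taken from the draft standard: the names (StateRange, collate, LEAF_LENGTH), the interval boundaries and the measurements are all hypothetical, and a real rule could just as well use means, ranges or proportions rather than simple presence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateRange:
    """One state of a character, defined as a half-open interval [low, high)."""
    name: str
    low: float
    high: float


def collate(measurements, ranges):
    """Return the name of every state that at least one measurement falls
    into, in the order the states are declared."""
    found = []
    for r in ranges:
        if any(r.low <= m < r.high for m in measurements):
            found.append(r.name)
    return found


# Hypothetical character: leaf blade length in mm (boundaries are invented).
LEAF_LENGTH = [
    StateRange("short", 0.0, 20.0),
    StateRange("medium", 20.0, 50.0),
    StateRange("long", 50.0, float("inf")),
]

# Invented measurements from two studies of the same taxon.
study_a = [12.0, 18.5, 22.0]
study_b = [25.0, 31.0]

# Pool the raw measurements first, then extract states from the pooled set.
print(collate(study_a + study_b, LEAF_LENGTH))  # ['short', 'medium']
```

The point of the sketch is only that the measurements are pooled before any states are derived, so the same rule can be re-run when new measurements arrive or when the state boundaries change.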

As to data attribution, the draft standard document allows (I hope) all
data elements to be attributed (to a specimen, taxon, literature citation,
person etc), but doesn't force the issue.
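
As a rough illustration of optional attribution, the sketch below gives every data element a source field that may be left empty. The class and function names (Source, DataElement, records_for_specimen, reassign_specimen), the field layout and the sample records are assumptions made for this example, not the structure defined in the draft standard.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Source:
    """What a data element is attributed to (hypothetical layout)."""
    kind: str  # e.g. "specimen", "taxon", "citation", "person"
    ref: str   # identifier for that source


@dataclass
class DataElement:
    taxon: str
    character: str
    value: Union[float, str]
    source: Optional[Source] = None  # attribution is allowed, not forced


def records_for_specimen(elements, specimen_ref):
    """Return every data element attributed to the given specimen."""
    return [
        e for e in elements
        if e.source is not None
        and e.source.kind == "specimen"
        and e.source.ref == specimen_ref
    ]


def reassign_specimen(elements, specimen_ref, new_taxon):
    """Move all data attributed to one specimen to another taxon.
    Returns the number of data elements that were moved."""
    moved = records_for_specimen(elements, specimen_ref)
    for e in moved:
        e.taxon = new_taxon
    return len(moved)


# Invented records: two attributed to a specimen, one left unattributed.
data = [
    DataElement("Taxon A", "leaf blade length (mm)", 34.0,
                Source("specimen", "voucher-001")),
    DataElement("Taxon A", "indumentum", "pubescent",
                Source("specimen", "voucher-001")),
    DataElement("Taxon A", "habit", "shrub"),
]

print(reassign_specimen(data, "voucher-001", "Taxon B"))  # 2
```

Data scored without attribution stays where it is, which is the price of not forcing the issue: only attributed elements can follow a specimen when it is moved to another taxon.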

>Kevin:
> >Or can we have a set of linked standards - one for describing in the boring
> >old characters/states/taxa way, and others for the more space-shuttle ways
> >that can be linked in as they develop.
>
>and Eric:
> >it is rather difficult to
> >combine datasets based on differing character lists, even when those
> >character lists are fairly similar. There is no mechanism for "mapping"
> >character states from one dataset onto those of another.
>
>We are back to an issue that was raised earlier, so at the risk of boring
>people by repetition, the way to record data is not (if at all possible)
>as states, but as measurements.  Measurements of the same character from
>different studies may then be combinable, and states extracted from the
>measurements.  But if one is combining states, then one is likely to be in
>trouble.  The "approved shapes" that a committee of the Systematics
>Association came up with some years back are fine for communication in
>descriptions and conversations, but the best way to record shape data,
>whether for phylogenetic analysis or deciding if one heap of specimens
>really is distinct from another, is as measurements.
>
>(I agree that a list of "approved" characters is probably not attainable
>at present.)
>
>This raises another issue that Eric mentioned:
>
> >Another aspect
> >might be the inclusion of meta-data (e.g., who recorded this observation,
> >and on what material?) While this sort of information can be placed into
> >DELTA "text" characters and comments (as could virtually anything
> >representable as a string of bytes, with a bit of effort), the use of text
> >characters does not provide for well-structured access to the information.
>
>This issue of metadata is very important: if a DNA sequence is linked to a
>voucher, then all measurements/observations should be linked to
>specimens.  In an "ideal" database, there would be links to specimens, if
>only through a literature citation; not being able to make easy links in
>the literature can make it difficult or impossible to use, and not having
>such links in a database greatly reduces its value.  If I go wrong in
>including a specimen in a particular taxon, I want to be able to pull out
>all information associated with this specimen, whether it is simply length
>of leaf blade or how the anther wall develops, when I put it in its
>correct place.
>
>Peter S.