Re: characters/states

29 Jul 2000

At 10:23 AM 25/7/00 -0500, Peter Stevens & Stinger commented on the hoary
issue of characters/states vs measurements:

It's completely true that a set of measurements is a more fundamental data
element than a character state. But we need to allow for both ways of
recording data, for two reasons. Firstly, some characters don't lend
themselves to measurement (e.g. indumentum - it would be possible but
hellishly difficult). Secondly, if I'm writing a key to Families, I'm
afraid I probably won't be recording measurements of individual specimens
very often, as the data blowout would be alarming. And what specimens would
I use? I see no way around the problem (for higher-order taxa at least) of
doing the collation of data (measurements->states) in my head, with all the
admitted problems that ensue.

The draft standard document I've put to the group allows for some
characters to be scored after a head-wise (hopefully wise, anyway)
collation, others to be collated from individual measurements using some
form of collation rule.
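
To make "collation rule" a little more concrete, here is a minimal sketch of one possible rule. It is only an illustration, not something taken from the draft standard: the character, the state names, the boundary values and the function names are all hypothetical.

```python
# Hypothetical state boundaries (in mm) for a "leaf length" character.
# Each entry is (low, high, state name); ranges are half-open: low <= x < high.
LEAF_LENGTH_STATES = [
    (0.0, 20.0, "short"),
    (20.0, 50.0, "medium"),
    (50.0, float("inf"), "long"),
]


def state_for(value, boundaries=LEAF_LENGTH_STATES):
    """Return the name of the state whose range contains a single measurement."""
    for low, high, name in boundaries:
        if low <= value < high:
            return name
    raise ValueError(f"no state covers the measurement {value}")


def collate(measurements, boundaries=LEAF_LENGTH_STATES):
    """Collate specimen measurements into the states observed for a taxon.

    States are returned in boundary order, each listed once.
    """
    if not measurements:
        raise ValueError("no measurements to collate")
    observed = {state_for(m, boundaries) for m in measurements}
    return [name for _, _, name in boundaries if name in observed]


# Three specimens of one taxon, measured individually.
print(collate([12.0, 18.5, 23.0]))
```

The point of the sketch is that the measurements stay on record and the states are derived from them, so changing the boundaries later only means re-running the rule.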

As to data attribution, the draft standard document allows (I hope) all
data elements to be attributed (to a specimen, taxon, literature citation,
person etc), but doesn't force the issue.
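
As a rough illustration of optional attribution (again a sketch under my own assumptions, not the structure used in the draft), a data element could carry an attribution that is allowed to be absent. The class names, field names and the specimen identifier below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribution:
    kind: str       # e.g. "specimen", "taxon", "literature", "person"
    reference: str  # an identifier or citation; the format is an assumption


@dataclass
class DataElement:
    character: str
    value: object
    attribution: Optional[Attribution] = None  # allowed, but not forced


def attributed_to(elements, kind, reference):
    """Return every data element attributed to the given source."""
    wanted = Attribution(kind, reference)
    return [e for e in elements if e.attribution == wanted]


elements = [
    DataElement("leaf blade length (mm)", 42.0,
                Attribution("specimen", "SPECIMEN-001")),
    DataElement("anther wall development", "recorded",
                Attribution("specimen", "SPECIMEN-001")),
    DataElement("indumentum", "hairy"),  # scored without attribution
]

# Pull out everything recorded from one (hypothetical) specimen.
for element in attributed_to(elements, "specimen", "SPECIMEN-001"):
    print(element.character, element.value)
```

With attributions recorded this way, everything tied to a single specimen can be retrieved together, for instance when that specimen has to be moved to a different taxon.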
...
Kevin: >Or can we have a set of linked standards - one for describing in
the boring old characters/states/taxa way, and others for the more
space-shuttle ways that can be linked in as they develop.

and Eric: >it is rather difficult to combine datasets based on differing
character lists, even when those character lists are fairly similar. There
is no mechanism for "mapping" character states from one dataset onto those
of another.

We are back to an issue that was raised earlier, so at the risk of boring
people by repetition, the way to record data is not (if at all possible)
as states, but as measurements.  Measurements of the same character from
different studies may then be combinable, and states extracted from the
measurements.  But if one is combining states, then one is likely to be in
trouble.  The "approved shapes" that a committee of the Systematics
Association came up with some years back are fine for communication in
descriptions and conversations, but the best way to record shape data,
whether for phylogenetic analysis or deciding if one heap of specimens
really is distinct from another, is as measurements.
(I agree that a list of "approved" characters is probably not attainable
at present.)

This raises another issue that Eric mentioned:

>Another aspect might be the inclusion of meta-data (e.g., who recorded
this observation, and on what material?) While this sort of information can
be placed into DELTA "text" characters and comments (as could virtually
anything representable as a string of bytes, with a bit of effort), the use
of text characters does not provide for well-structured access to the
information.

This issue of metadata is very important: if a DNA sequence is linked to a
voucher, then all measurements/observations should be linked to
specimens.  In an "ideal" database, there would be links to specimens, if
only through a literature citation; not being able to make easy links in
the literature can make it difficult or impossible to use, and not having
such links in a database greatly reduces its value.  If I go wrong in
including a specimen in a particular taxon, I want to be able to pull out
all information associated with this specimen, whether it is simply length
of leaf blade or how the anther wall develops, when I put it in its
correct place.
Peter S.

Kevin Thiele