Re: characters/states and measurements and other hoary problems
While I am philosophically most sympathetic to Peter's arguments, I must agree with Kevin on the need to distinguish measurements as "characters" from qualitative characters.
The bases of comparison for measurements typically differ fundamentally from those used to assign states to qualitative characters. They depend not only on the exact definition [criteria for recognition] of the endpoints, which may differ systematically between investigators (or methods of measurement), but also on temporal scope (i.e., some endpoints may not be identifiable except within a particular size range for a given structure in a given taxon). As Kevin pointed out, they also depend on sampling, although one would hope that most investigators are attuned to the need to increase sample sizes to avoid bias or mischaracterization. For measurements, the bases of comparison are geometrically constrained, whereas qualitative descriptions are not so constrained, even though both may be used to characterize the "same" features.
Nonetheless, one could in some cases meaningfully tag measurements with some of the same tags as qualitative characters to indicate that they are proxies for the "same" features. This would allow data matrices conforming to these "generalized" features to be associated, and the user could then assess whether the association is "sufficiently tight" to warrant regarding them as proxies for the "same" feature. The difficulty comes in consistently applying/defining the basis of comparison and in establishing criteria to recognize when constraints imposed on using measurements create inconsistency with various qualitative definitions of "state".
Perhaps attributes of a tag element could be used to indicate (hint at) different shades of meaning when measurements are associated with qualitative states (e.g., <leaf character><measurement character endpoint1="regionally localized" endpoint1_precision=">>5%" endpoint2="precisely localized" endpoint2_precision="<<1%">45.2</measurement character></leaf character>, or some such construction, where in this case <leaf character> may take the same tag name as the qualitative character might). However, we may also need to include additional refining elements, such as <data measured by>Peter Stevens</data measured by><data measured using>calipers</data measured using>, or some such, to fully relate fundamentally different data types or qualify the measurement context. In any event, such shades of meaning would require that "appropriate" context be established, and context could vary enormously depending on the features being evaluated.
In image analysis, one of the reasons image algebra was developed was to assist in the standard characterization (transformation) of images and image fragments ("features"). However, this only works when there is a consistent basis for comparison (typically pixel size and pixel neighborhood) that underlies the sets being defined.
Unfortunately (or perhaps fortunately), the human vision system is vastly more variable and complex.
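To make the tag-attribute idea above a little more concrete, here is a minimal sketch in Python that builds a well-formed version of such an element. Everything in it is an assumption for illustration only: XML names cannot contain spaces, so underscores are substituted (leaf_character, measurement_character, data_measured_by, data_measured_using), the refining elements are placed inside the feature element, and the helper build_measurement is hypothetical rather than part of any proposed standard.

```python
import xml.etree.ElementTree as ET


def build_measurement(feature_tag, value, endpoints,
                      measured_by=None, measured_using=None):
    """Wrap one measurement in a feature element shared with a qualitative character.

    endpoints is a list of (description, precision) pairs; the attribute
    names endpointN / endpointN_precision are illustrative assumptions.
    """
    feature = ET.Element(feature_tag)
    measurement = ET.SubElement(feature, "measurement_character")
    for i, (description, precision) in enumerate(endpoints, start=1):
        measurement.set(f"endpoint{i}", description)
        measurement.set(f"endpoint{i}_precision", precision)
    measurement.text = str(value)
    # Optional refining elements that qualify the measurement context.
    if measured_by is not None:
        ET.SubElement(feature, "data_measured_by").text = measured_by
    if measured_using is not None:
        ET.SubElement(feature, "data_measured_using").text = measured_using
    return feature


leaf = build_measurement(
    "leaf_character",
    45.2,
    [("regionally localized", ">>5%"), ("precisely localized", "<<1%")],
    measured_by="Peter Stevens",
    measured_using="calipers",
)

# Serialising escapes the "<" and ">" inside attribute values, so the
# result is well-formed and can be parsed back without loss.
xml_text = ET.tostring(leaf, encoding="unicode")
roundtrip = ET.fromstring(xml_text)
```

Because the outer element carries the same name a qualitative character would use, a consumer could collect both kinds of record under one feature and then judge for itself how tight the association really is.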
In the near term, Kevin is probably correct to suggest that the onus is on those who would care to tackle the "hellish difficulties" of making such associations. It may be best to fork the dialog on this point, which is not to say that we shouldn't seek a general approach that could later be extended to include possible solutions to such "issues of association" (i.e., better establish a useful lexicon and grammar for <measurement character> elements, as opposed to <qualitative character> elements).
Nonetheless, one of the unfortunate practices in systematics has been the relegation of measurement data to "notebooks" or, more recently, "floppy/CD disks" that are not properly archived in any coordinated fashion, except perhaps for the duration of a study or some unspecified time after that, and that are often at best only partially summarized in the original publication. The discipline could be much, much richer if there were a consistent standard for archival and association between measurement data and qualitative description, if for no other reason than to recognize the various contexts within which these data might be useful in the future, not to mention to encourage workers to let others know "exactly" what they did to arrive at their conclusions. Measurement data can be extremely useful in providing insight into how different investigators actually recognize the various states of their characters, and may suggest why different investigators do not recognize the same states for the "same" characters.
Kevin Thiele wrote:
At 10:23 AM 25/7/00 -0500, Peter Stevens & Stinger commented on the hoary issue of characters/states vs measurements:
It's completely true that a set of measurements is a more fundamental data element than a character state. But we need to allow for both ways of recording data, for two reasons. Firstly, some characters don't lend themselves to measurement (e.g. indumentum - it would be possible but hellishly difficult). Secondly, if I'm writing a key to Families, I'm afraid I probably won't be recording measurements of individual specimens very often, as the data blowout would be alarming. And what specimens would I use? I see no way around the problem (for higher-order taxa at least) of doing the collation of data (measurements->states) in my head, with all the admitted problems that ensue.
The draft standard document I've put to the group allows for some characters to be scored after a head-wise (hopefully wise, anyway) collation, others to be collated from individual measurements using some form of collation rule.
As to data attribution, the draft standard document allows (I hope) all data elements to be attributed (to a specimen, taxon, literature citation, person etc), but doesn't force the issue.
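As a rough illustration of what "some form of collation rule" with optional attribution might look like, here is a small Python sketch. It is not taken from the draft standard: the Measurement record, the collate function, the choice of the mean as the summary statistic, and the state boundaries in LENGTH_RULE are all assumptions made up for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Measurement:
    value_mm: float
    # Attribution is allowed but not forced, so both fields are optional.
    specimen: Optional[str] = None
    recorded_by: Optional[str] = None


# Hypothetical collation rule: (exclusive upper bound in mm, state name),
# listed in ascending order of the bound.
LENGTH_RULE = [(20.0, "short"), (50.0, "medium"), (float("inf"), "long")]


def collate(measurements, rule):
    """Collapse individual measurements to one state using their mean."""
    if not measurements:
        raise ValueError("no measurements to collate")
    average = mean(m.value_mm for m in measurements)
    for upper_bound, state in rule:
        if average < upper_bound:
            return state
    raise ValueError("collation rule does not cover the observed value")


leaf_lengths = [
    Measurement(42.0, specimen="Voucher-001", recorded_by="observer A"),
    Measurement(45.2, specimen="Voucher-002"),
    Measurement(48.4),  # unattributed, which is permitted
]

collated_state = collate(leaf_lengths, LENGTH_RULE)
```

A character scored "in the head" would simply store the state directly, without the list of measurements; the two routes could sit side by side as long as each record says which one was used.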
Kevin: > Or can we have a set of linked standards - one for describing in the boring old characters/states/taxa way, and others for the more space-shuttle ways that can be linked in as they develop.
and Eric: > it is rather difficult to combine datasets based on differing character lists, even when those character lists are fairly similar. There is no mechanism for "mapping" character states from one dataset onto those of another.
We are back to an issue that was raised earlier, so at the risk of boring people by repetition, the way to record data is not (if at all possible) as states, but as measurements. Measurements of the same character from different studies may then be combinable, and states extracted from the measurements. But if one is combining states, then one is likely to be in trouble. The "approved shapes" that a committee of the Systematics Association came up with some years back are fine for communication in descriptions and conversations, but the best way to record shape data, whether for phylogenetic analysis or deciding if one heap of specimens really is distinct from another, is as measurements.
(I agree that a list of "approved" characters is probably not attainable at present.)
This raises another issue that Eric mentioned:
Another aspect might be the inclusion of meta-data (e.g., who recorded this observation, and on what material?) While this sort of information can be placed into DELTA "text" characters and comments (as could virtually anything representable as a string of bytes, with a bit of effort), the use of text characters does not provide for well-structured access to the information.
This issue of metadata is very important: if a DNA sequence is linked to a voucher, then all measurements/observations should be linked to specimens. In an "ideal" database, there would be links to specimens, if only through a literature citation; not being able to make easy links in the literature can make it difficult or impossible to use, and not having such links in a database greatly reduces its value. If I go wrong in including a specimen in a particular taxon, I want to be able to pull out all information associated with this specimen, whether it is simply length of leaf blade or how the anther wall develops, when I put it in its correct place.
Peter S.
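The specimen-linking point above can be sketched as a small Python example. This is only an illustration under stated assumptions: the SpecimenStore class, its method names, and the identifiers "Voucher-001", "Taxon A" and "Taxon B" are all hypothetical. The idea shown is that observations hang off the specimen rather than the taxon, so moving a specimen carries everything recorded for it along.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    specimen_id: str
    character: str
    value: object
    source: str = ""  # e.g. a literature citation or an observer's name


class SpecimenStore:
    """Keeps observations keyed to specimens; taxa are reached via specimens."""

    def __init__(self):
        self.taxon_of = {}      # specimen_id -> current taxon name
        self.observations = []  # every recorded Observation

    def add_specimen(self, specimen_id, taxon):
        self.taxon_of[specimen_id] = taxon

    def record(self, observation):
        if observation.specimen_id not in self.taxon_of:
            raise KeyError(f"unknown specimen: {observation.specimen_id}")
        self.observations.append(observation)

    def for_specimen(self, specimen_id):
        return [o for o in self.observations if o.specimen_id == specimen_id]

    def for_taxon(self, taxon):
        return [o for o in self.observations
                if self.taxon_of[o.specimen_id] == taxon]

    def reassign(self, specimen_id, new_taxon):
        """Move a misplaced specimen and return everything linked to it."""
        self.taxon_of[specimen_id] = new_taxon
        return self.for_specimen(specimen_id)


store = SpecimenStore()
store.add_specimen("Voucher-001", "Taxon A")
store.record(Observation("Voucher-001", "leaf blade length (mm)", 45.2))
store.record(Observation("Voucher-001", "anther wall development", "described in notes"))

# The specimen turns out to belong elsewhere: its data follow it.
moved = store.reassign("Voucher-001", "Taxon B")
```

Had the same values been stored only as taxon-level states, there would be no way to tell which of them came from the misplaced specimen.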
participants (1)
-
Stuart G. Poss