This raises the old argument for scoring _data_ in our databases
rather than information derived from those data. This is to say that
your system should at least allow scoring descriptive data at specimen
level rather than at the conceptual level of species (or other taxa).
Thus the descriptions of taxa could be derived "just in time", i.e. on
the fly when required. Redetermination of a given suite of specimens
would result in all the relevant descriptions, keys, and other
products being, in effect, dynamic. This will be vital for projects
which are institutional or international in scope.
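To make the "just in time" idea concrete, here is a minimal sketch in Python. It assumes a hypothetical record layout (the Specimen class, the describe_taxon function, the character name leaf_length_mm and the taxon names are all invented for illustration, not taken from any existing system). Only specimen-level measurements are stored; the taxon description is computed when asked for, so a redetermination changes the description without anything being rescored.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Specimen:
    """Hypothetical specimen record: data are scored here, not on the taxon."""
    specimen_id: str
    determination: str  # current taxon name; may change on redetermination
    measurements: dict = field(default_factory=dict)  # character -> value


def describe_taxon(specimens, taxon):
    """Derive a taxon description at call time from the specimens
    currently determined as that taxon."""
    pooled = {}
    for s in specimens:
        if s.determination != taxon:
            continue
        for character, value in s.measurements.items():
            pooled.setdefault(character, []).append(value)
    return {
        character: {
            "n": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for character, values in pooled.items()
    }


# Invented example data.
specimens = [
    Specimen("A001", "Species alpha", {"leaf_length_mm": 40.0}),
    Specimen("A002", "Species alpha", {"leaf_length_mm": 50.0}),
    Specimen("A003", "Species beta", {"leaf_length_mm": 90.0}),
]

before = describe_taxon(specimens, "Species alpha")

# Redetermine one specimen; nothing else is edited.
specimens[2].determination = "Species alpha"

after = describe_taxon(specimens, "Species alpha")
print(before["leaf_length_mm"])
print(after["leaf_length_mm"])
```

The point of the design is that the description is never stored, so it cannot go stale: the second call picks up the redetermined specimen automatically.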
<bold>If databases are to be anything other than local, I think this is the issue</bold>. One can, in theory at least, go from measurements to shape terms and still keep the basic data intact. At the risk of overkill, below are some points that are (I think) important to bear in mind. Note that the measurements and images I talk about (sorry, this is edited from some notes which may turn into a paper or a lecture for next term's class) are really the metadata that ground the data of the description of an animal or plant, or the character states of a phylogenetic analysis. Although the comments below are couched in terms of phylogenetic analysis, the metadata assembled would be useful for a variety of purposes.
<fontfamily><param>Times</param><bigger><bigger><bigger>Access to individual observations linked to individual specimens is the ideal towards which we should strive. Photographs, drawings and images in general, together with measurements and bibliographic references, are the proper metadata of morphological-phylogenetic data; they ground those data.
Having such metadata readily accessible has several advantages.
1. We should be able to revisit earlier decisions about how states were delimited by examining the observations available to the authors who made those decisions.
2. The same character may be quite properly divided in different ways for different taxa within a larger group. When studying the larger group, we need to combine all the raw data from which the states were separately abstracted in the smaller studies. Combining raw data may destroy states; looking at a subset of the raw data may suggest additional states.
3. There is no knowing how future morphological observations will relate to a gap that currently separates two states (this is unlike the normal situation for molecular data). Similarly, data based on misidentified specimens, or on taxa assigned to the wrong higher taxon, should be in a form that allows them to be easily placed in their correct contexts, and that allows any effect this may have on state delimitation to be evaluated at the same time.
4. Gap coding is not universally accepted, and there are at present no sound reasons for preferring one method of coding over another. There are few studies that code the same variation in different ways and compare the results. Absent such reasons, studies and comparisons, metadata will enable the same observations to be used under a variety of coding practices.
5. If the states scored for a taxon are those believed to be basal in it (Nandi et al. 1998; Doyle and Endress 2000), details of the variation within the taxon need to be accessible. This is particularly necessary given that our knowledge of taxa presumed to be "basal" in a lineage, and so sometimes (but largely incorrectly) emphasized when assigning states, is in a state of flux.
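Points 2 and 4 can be illustrated with a small sketch. It uses a deliberately simplified gap-coding rule of my own choosing for illustration (a new state begins wherever two consecutive sorted values differ by more than a fixed threshold); it is not meant to stand for any particular published method, and the function name gap_code, the threshold values and the measurements are all invented.

```python
def gap_code(values, min_gap):
    """Simplified gap coding (illustrative assumption, not a published method):
    sort the raw values and start a new state wherever two consecutive
    values differ by more than min_gap. Returns {value: state}."""
    ordered = sorted(set(values))
    states = {}
    state = 0
    for i, value in enumerate(ordered):
        if i > 0 and value - ordered[i - 1] > min_gap:
            state += 1
        states[value] = state
    return states


def count_states(coding):
    """Number of distinct states in a {value: state} coding."""
    return len(set(coding.values()))


# Invented raw measurements from two hypothetical smaller studies.
study_a = [10.0, 11.0, 12.0, 20.0, 21.0]
study_b = [14.0, 16.0, 18.0]

# Study A alone shows a clear gap between 12 and 20.
n_a = count_states(gap_code(study_a, min_gap=3.0))

# Pooling the raw data fills the gap, so the two states merge.
n_pooled = count_states(gap_code(study_a + study_b, min_gap=3.0))

# The same pooled data under a stricter threshold give more states.
n_pooled_fine = count_states(gap_code(study_a + study_b, min_gap=1.5))

print(n_a, n_pooled, n_pooled_fine)
```

None of these recodings would be possible if only the scored states from each study had been kept; the raw measurements are what let the coding be redone.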
The general direction in which we should go is clear. GenBank is more reliable than Vernon Heywood's shoebox in which we often place our data (Heywood, 19). The shoebox may be discarded when we move, and our heirs and assignees may well consign it to the recycling pile or bonfire on our deaths. Unless our observations can be stored somewhere, phylogenetic hypotheses will not cumulate. Information on each character should be stored as individual measurements and images as far as is possible, and linked to specimens and place of publication. If we are serious about morphology, we need an equivalent to GenBank, a MorphBank.
</bigger></bigger></bigger></fontfamily>