From: Kevin Thiele kevin.thiele@PI.CSIRO.AU
To: TDWG - Structure of Descriptive Data TDWG-SDD@USOBI.ORG
Create a key to all grass species, so that you're working with a list of all taxa at species level, including all the Poa species. The character list has two classes of characters: ones that are scored over all taxa (these will be the easily generalised characters) and ones that are scored for only a subset of taxa (the characters that are highly specific and/or not easily generalisable). When the key program starts, it splashes up the generalised characters only. But if, after answering some characters, you end up with only Poas, the program finds the Poa-specific characters and adds them to the character list. Characters are progressively revealed as you proceed through the key, with as much depth as necessary - e.g. you may come down to a species complex of alpine Poas and presto! some characters appear that are just the ticket to separate them.
Something like this, but less rigid, is achieved automatically by algorithms for finding 'best' characters. A 'best' algorithm typically has a penalty for characters which are unknown, inapplicable, or variable for some taxa, but it does not completely exclude them. The 'best' algorithm used in our programs Intkey (interactive identification) and Key (generation of conventional keys) has a natural penalty for such characters, arising from the goal of minimizing the average length of an identification. The algorithm also has a parameter, Varywt, which can be used to add an arbitrary penalty for such characters. However, this tends to increase the average length of an identification, so its use in Intkey is not recommended. A value of 0 for Varywt would have an effect similar to that proposed above by Kevin.
If you try this in Intkey, note that a value of 0 is treated as 0.01, an ad hoc adjustment made specifically to _avoid_ the complete exclusion of the characters. In a data set with a substantial number of missing or inapplicable values (e.g. the sample data supplied with the DELTA programs), it is easy to observe how low values of Varywt cause characters with comparatively low separating power (and which therefore result in longer identifications) to move towards the top of the 'best' list.
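To make the effect of the weight concrete, here is a toy sketch in Python. It is not the Intkey or Key algorithm: the scoring formula, the `Character` fields, and the function names are all assumptions made up for illustration. The only things taken from the description above are that a weight of 1 adds no extra penalty, that lower weights push down characters that are variable, unknown, or inapplicable for some taxa, and that the interactive program treats 0 as 0.01.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Character:
    name: str
    separating_power: float  # hypothetical score, higher is better
    problem_taxa: int        # taxa where the character is variable, unknown, or inapplicable


def effective_varywt(varywt: float, interactive: bool) -> float:
    """Interactive use treats 0 as 0.01 so that no character is fully excluded."""
    if interactive and varywt == 0:
        return 0.01
    return varywt


def rank(characters: List[Character], varywt: float, interactive: bool = True) -> List[str]:
    """Return character names, best first, under an ASSUMED multiplicative penalty."""
    w = effective_varywt(varywt, interactive)

    def score(c: Character) -> float:
        # Assumed penalty: one factor of w per problem taxon; none if fully scored.
        penalty = w ** c.problem_taxa if c.problem_taxa else 1.0
        return c.separating_power * penalty

    scored = [(c.name, score(c)) for c in characters]
    usable = [(name, s) for name, s in scored if s > 0]  # a score of 0 means excluded
    return [name for name, _ in sorted(usable, key=lambda pair: -pair[1])]


# Made-up data: A separates well but is variable in two taxa; B is weaker but fully scored.
chars = [Character("A", 0.9, 2), Character("B", 0.5, 0)]

print(rank(chars, 1.0))                     # ['A', 'B']
print(rank(chars, 0.5))                     # ['B', 'A']
print(rank(chars, 0, interactive=True))     # ['B', 'A']
print(rank(chars, 0, interactive=False))    # ['B']
```

With a low weight, the weaker character B moves to the top of the list, which is the behaviour described above. Only the non-interactive case with a weight of 0 drops A altogether.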
Varywt is primarily intended for use in Key, where the increased average length of an identification is offset by a reduction in the _printed_ length of the key. In Key, a value of 0 completely prevents the use of characters with intra-taxon variability.
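For comparison, the strict version (Kevin's progressive reveal, or the complete exclusion that a value of 0 gives in Key) amounts to a simple filter: a character is offered only if it is scored for every taxon still remaining. The following Python sketch illustrates that idea only. The taxon names, character names, and the `available_characters` function are hypothetical and are not taken from any DELTA program.

```python
from typing import Dict, List, Set

# Made-up data: for each character, the set of taxa it has been scored for.
SCORED_FOR: Dict[str, Set[str]] = {
    "habit": {"Poa A", "Poa B", "Festuca A", "Bromus A"},
    "inflorescence type": {"Poa A", "Poa B", "Festuca A", "Bromus A"},
    "lemma hairs": {"Poa A", "Poa B"},  # scored for the Poa species only
}


def available_characters(remaining: Set[str], scored_for: Dict[str, Set[str]]) -> List[str]:
    """Return the characters that are scored for every taxon still remaining."""
    return sorted(name for name, taxa in scored_for.items() if remaining <= taxa)


all_taxa = {"Poa A", "Poa B", "Festuca A", "Bromus A"}
only_poas = {"Poa A", "Poa B"}

print(available_characters(all_taxa, SCORED_FOR))
# ['habit', 'inflorescence type']
print(available_characters(only_poas, SCORED_FOR))
# ['habit', 'inflorescence type', 'lemma hairs']
```

Once only the Poas remain, the Poa-specific character appears. A 'best' ranking with a penalty reaches much the same result without a hard cut-off, because the penalty for a character disappears once the taxa that caused it have been eliminated.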
--
Mike Dallwitz
CSIRO Entomology, GPO Box 1700, Canberra ACT 2601, Australia
Phone: +61 2 6246 4075
Fax: +61 2 6246 4000
Email: md@ento.csiro.au
Internet: biodiversity.uno.edu/delta/