Rich, Kevin, as one of the main supporters of taxonID, I think the final decision about what stable piece of information it reflects sits with the publisher/author of the data. The problem Kevin describes is very common and comes down to the difficulty of describing what information really is stable in taxonomy. The name may change, as may the classification, the textual description and the distribution, and it's probably impossible to tell automatically whether anything significant has changed or whether only small "corrections" have been made. Even for humans it's a challenge to compare several textual descriptions and decide whether they describe the same thing or not. I guess this is where some people see the conceptID coming into play to give some long-term stability, but I still don't see much gain in another ID if you cannot tell which pieces of the information behind that ID are stable over time. I still believe that with scientificNameID and taxonID we are well equipped to deal with our data. I tend to think that whoever publishes the data should think carefully about how they assign and change taxonIDs, and try to announce what they consider stable, what is versioned and what can change at any time without changing the meaning, i.e. the ID. But I would assume this differs between databases focussing on different aspects of taxonomy.
For purely automatically aggregated data in Checklist Bank I decided to assign stable taxonIDs to entire lexical groups of names that contain at least one qualified name, i.e. one with proper authorship. The classification, the spelling and the authorship of the preferred, representative name for that group might change, but at least you have some definition of stability, which is really hard to define. For lexical name groups that only contain bare canonical names I am assigning volatile IDs right now, as these might change dramatically with more knowledge about them. I don't know if this could also apply to Kevin's problem: do you treat the same qualified name in multiple concepts? If that's the case then surely you need something else to define a stable taxonID.
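To make the rule above concrete, here is a minimal sketch in Python. It is not the actual Checklist Bank code: the `Name` class, the crude `lexical_key` grouping, the namespace URL and the choice of deriving the stable ID from the group key are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    canonical: str        # bare canonical name string
    authorship: str = ""  # empty for a bare canonical name

    @property
    def qualified(self) -> bool:
        # A name counts as "qualified" when it carries an authorship.
        return bool(self.authorship.strip())


def lexical_key(name: Name) -> str:
    # Hypothetical grouping rule: lower-case and collapse whitespace.
    # A real lexical grouping would also handle spelling variants.
    return " ".join(name.canonical.lower().split())


# Hypothetical namespace for deriving repeatable IDs.
STABLE_NS = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/lexical-groups")


def assign_ids(names):
    """Map each lexical group key to a (kind, id) pair."""
    groups = {}
    for n in names:
        groups.setdefault(lexical_key(n), []).append(n)

    result = {}
    for key, members in groups.items():
        if any(m.qualified for m in members):
            # At least one qualified name: the ID is repeatable across runs.
            result[key] = ("stable", str(uuid.uuid5(STABLE_NS, key)))
        else:
            # Only bare canonical names: a fresh, volatile ID on every run.
            result[key] = ("volatile", str(uuid.uuid4()))
    return result
```

In this sketch the ID is tied to the group rather than to the preferred name, so the representative spelling, authorship or classification can be revised without touching the ID.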
Markus
On Jul 5, 2010, at 9:26, Richard Pyle wrote:
This is why I'm very uncomfortable with the entire notion of "taxonID". The main reason I'm pushing so hard for taxonNameUsageIDs (à la GNUB) is that these are the "atoms" (as Dave R. calls them) of both nomenclature *and* most existing concept definitions. If we can get permanent and widely shared/re-used IDs on these "atoms", then we can assemble the complex molecules from them. Someone's notion of a taxon concept then becomes a set of TNUIDs. I have mixed feelings about branding these sets with permanent GUIDs; but if we did, this is what I imagine taxonID in DwC would (ultimately) represent. If we want to archive the sets for posterity, then we can certainly brand them with IDs. But I tend to think these can instead be dynamic services that assemble the sets either algorithmically or through the fingertips of experts.
So...I guess before we do anything, we need to get a common sense for what is intended to be represented by taxonID. I suspect my own view is not shared by all (or even most).
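The idea of a taxon concept as a set of TNUIDs, assembled on demand, could be sketched roughly as follows. This is only an illustration under assumed names: `TaxonNameUsage`, `concept_from_usages` and the ID format are hypothetical and are not part of GNUB or DwC.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class TaxonNameUsage:
    tnu_id: str     # permanent, shared ID of the usage (the "atom")
    name: str       # the name as used
    reference: str  # the reference in which the name was used


def concept_from_usages(
    usages: Iterable[TaxonNameUsage],
    belongs: Callable[[TaxonNameUsage], bool],
) -> FrozenSet[str]:
    """Assemble a concept as the set of TNU IDs selected by a rule.

    The rule may be an algorithm or a lookup of an expert's picks;
    nothing is stored, the set is rebuilt on every call.
    """
    return frozenset(u.tnu_id for u in usages if belongs(u))
```

Two concepts built this way compare equal exactly when they contain the same usages, whether or not the set itself is ever given an ID of its own.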
Rich
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Kevin Richards
Sent: Sunday, July 04, 2010 5:44 PM
To: tdwg-content@lists.tdwg.org
Subject: [tdwg-content] Taxon Concept dilemma
Hello all,
I have an issue that I would like some comment on…
We have some data that covers Taxa, Names and Concept relationships, e.g.:
A Taxon table that contains the nomenclatural details + accepted name + parent name
Concept + relationship tables that contain details about the name + references where the name has been used in a taxonomic sense (ie not nomenclatural information) – this is specifically a link between the Name and a Reference
We have fairly permanent IDs for the Taxon Name (nomenclatural) and the Concepts, but I now want to consider an ID that covers the whole Taxon (i.e. the nomenclatural data + taxon rank + parent name + accepted name, etc., as “we” understand them). This is probably equivalent to the taxonID in DwC.
The problem is this tends to be much more dynamic data – ie, in this particular case we have aggregated data from a variety of providers and are in continual revision of this data - as we revise the data the details such as the accepted name may change – this troubles me a bit, because this could be seen as fundamentally changing the definition of the object behind the taxonID. However, I suspect this is a common case that people find themselves in – ie revision/tidying of aggregated datasets must be quite common.
I would prefer to NOT change the taxonID every time we revise that data (taking the angle that these changes are corrections, so are not changing the object itself). Should it be OK to have an object type like this, that is likely to change, but keep the ID permanent for it – ie accept that some object types are quite dynamic?
The only other option is to maintain a hideous version audit trail, that probably hinders the use of the data more than it benefits the end user by providing “stability”.
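As a rough illustration of the first option (a permanent ID on an object whose details get revised), here is a minimal Python sketch. The `TaxonRecord` fields, the `revision` counter and the `revise` helper are assumptions made for the example, not part of any existing schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TaxonRecord:
    taxon_id: str         # permanent: never changes across revisions
    scientific_name: str
    accepted_name: str
    parent_name: str
    revision: int = 1     # bumped on every correction


def revise(record: TaxonRecord, **changes) -> TaxonRecord:
    """Return a corrected copy that keeps the same taxon_id."""
    if "taxon_id" in changes:
        raise ValueError("taxon_id is permanent and cannot be revised")
    return replace(record, revision=record.revision + 1, **changes)
```

A single revision counter is much lighter than a full audit trail: it tells a consumer that something behind the ID was corrected, without recording what.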
Any thoughts?
Kevin