Hi Gregor,
Thanks for this. :-)
What I am trying to say about taxon concept management is that we usually don't have the biological knowledge to document the circumscription of a taxon in an objective way. There are trivial and easy cases, like two morphoforms, long recognized and well understood, that are now understood to be truly only one species with varieties. In practice this is, in my opinion, rare. More likely is that a new species is now detected. However, it then depends on the phrasing and structure of the identification tools previously used whether earlier identifications according to a given publication would have included the new species or not. That is, in the light of a new species, every publication that aids identification defines its own taxon concept.
Yes, and unfortunately they are written in a way that makes it difficult to determine which specimens are instances of this concept and which are not. In part, this is because they are interpreted differently by different taxonomists.
I am rather certain that biologists don't have the resources to investigate this in its full breadth. However, in some important cases, where necessary or particularly interesting, these resources have been found, and opinions about the concept mapping have been expressed. To me the function of managing taxon concepts is to be able to manage this, while still allowing one to reason (with greater margin of error) on the "default assumption" that - unless otherwise known - most taxon concepts for most purposes can be "roughly" set equivalent.
This is why I was commenting on the attempt to introduce a single class of taxon-concept URIs for all living things, without contemplating the mismatch of resources to actually document and verify them.
My intent was to have a set of taxonconcept URIs, but this does not prevent someone from creating other ones that are broader or narrower.
It would be best if these were done in a way that was compatible so that you could make the following statements.
Some of the existing "concept" like things that I link to using skos:closeMatch are not really the same "kind of thing".
But if someone followed the same general form but had their concept entail a slightly different set of individuals you could do the following.
<txn:SpeciesA> <vocab:sensu_lato_To> <bio:SpeciesY>
In this sense the txn concepts and the "bio" concepts are the same kind of species concept, but they entail different sets of individual organisms.
Within each of these namespaces there should be little overlap between concepts.
By that I mean very few individuals would fall within two or more concepts.
But a bio concept could map to multiple txn concepts (sensu lato) or the other way around.
So within a given namespace like txn or bio the concepts are relatively clean and separate, but they don't have to be clean between namespaces.
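As a rough illustration of the namespace idea above, here is a minimal Python sketch that models each concept as the set of individuals it entails. The individual identifiers, the extra concept txn:SpeciesB, and the helper names are all hypothetical; this is not how the txn or bio concepts are actually published, only a way to make the "clean within, sensu lato between" relationship concrete.

```python
# Hypothetical data: each concept is modelled as the set of individuals it entails.
txn = {
    "txn:SpeciesA": {"ind1", "ind2"},
    "txn:SpeciesB": {"ind3", "ind4"},
}
bio = {
    "bio:SpeciesY": {"ind1", "ind2", "ind3", "ind4"},
}


def overlap_within(namespace):
    """Return pairs of concepts in one namespace that share any individual."""
    names = sorted(namespace)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if namespace[a] & namespace[b]
    ]


def sensu_lato_of(broad_members, narrow_namespace):
    """Return the narrower concepts fully contained in the broad concept."""
    return sorted(
        name
        for name, members in narrow_namespace.items()
        if members <= broad_members
    )


# Within txn the concepts do not overlap ...
clean = overlap_within(txn)
# ... but one bio concept can cover several txn concepts (sensu lato).
covered = sensu_lato_of(bio["bio:SpeciesY"], txn)
```

In this toy data the txn concepts stay disjoint, while bio:SpeciesY is broader than both of them, so an individual such as ind1 is an instance of txn:SpeciesA and of bio:SpeciesY at the same time.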
Also it is not obvious in their current form, but the txn concepts will eventually provide some information that helps determine what individuals are instances of that concept and what individuals are not instances of that concept.
Someone identifying specimens could use this to determine which species concept is most appropriate for their specimen.
They should be able to say that their specimen is an instance of a txn:SpeciesX *and* an instance of bio:SpeciesY.
In the same way, that specimen might be an instance of both a sensu lato concept and a sensu stricto concept.
This will take a while :-)
Respectfully,
- Pete
Are interpreted as being statements about the same "thing"
What you describe is the goal!
An interesting question I have is whether I can distinguish reasoning that was made based on those cases where taxon concepts have been investigated, from those based on default nomenclatural equivalence (i.e. homotypic synonyms)?
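One way this distinction might be kept visible is to tag every concept mapping with the basis on which it was made. The following Python sketch assumes a hypothetical set of expert-investigated pairs and a hypothetical `same_name` flag standing in for default nomenclatural equivalence; none of these names come from an existing vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    source: str
    target: str
    basis: str  # "investigated" or "default"


# Hypothetical pairs for which a concept mapping has actually been studied.
INVESTIGATED = {("txn:SpeciesA", "bio:SpeciesY")}


def map_concepts(source: str, target: str, same_name: bool) -> Optional[Mapping]:
    """Return a mapping tagged with how it was justified, or None if unjustified."""
    if (source, target) in INVESTIGATED:
        return Mapping(source, target, "investigated")
    if same_name:
        # Fallback: the "default assumption" that matching names are roughly equivalent.
        return Mapping(source, target, "default")
    return None
```

Any conclusion derived from a mapping can then carry the `basis` value along, so results resting only on the default assumption can be filtered out or reported with a wider margin of error.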
Also, I think I am missing timelines. If I make the default assumption: "names match" for the last 20 years, I am likely to get good results. If I do this for the last 200 years, I probably end up with garbage only.
One thing I have been thinking of: provided I know when a new treatment that is now generally accepted was published, and assuming that it takes 10 years to spread (plenty of counter-examples to that...), then such a timeframe might be used to qualify the quality of inferences. Perhaps...
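A minimal Python sketch of that timeframe idea, assuming only a publication year for the treatment and the rough 10-year spread mentioned above. The function name, the labels, and the hard cut-offs are illustrative assumptions, not an established method.

```python
def concept_in_use(identification_year: int, treatment_year: int,
                   spread_years: int = 10) -> str:
    """Guess which concept an identification most likely followed.

    "old"       : made before the new treatment was published
    "uncertain" : made while the treatment was still spreading
    "new"       : made after the assumed spread period
    """
    if identification_year < treatment_year:
        return "old"
    if identification_year >= treatment_year + spread_years:
        return "new"
    return "uncertain"
```

For a treatment published in 2000, an identification from 1990 would be read under the old concept, one from 2005 would be flagged as uncertain, and one from 2010 or later would be read under the new concept. Given the counter-examples, the "uncertain" window is probably better treated as a warning than as a hard rule.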
Also, the quality of inferences differs between species that have recently been lumped and species that have been split. In the first case, fairly naive heterotypic synonymy works quite reliably; in the latter case, only detailed (and usually unavailable) study of the taxon concepts works.
I think the present work of listing the accepted names with their present heterotypic synonymy in CoL will be a big step forward. I wonder whether these data can be used to identify validity time periods and information about splitting or lumping changes?
I have no ready solutions to these problems and appreciate all who tackle them!
Gregor