[tdwg-content] DwC taxonomic terms

"Markus Döring (GBIF)" mdoering at gbif.org
Wed Aug 26 10:28:08 CEST 2009

>> Stan,
>> I'm not sure if I understand you correctly when saying we should
>> differentiate between describing taxa and using them for
>> identifications. Do you suggest to have a completely separate set of
>> terms for the two purposes?
> Taxon(Concept)ID would be the same in both contexts, description and  
> identification.  My point was that TaxonNameID should NOT be  
> included in an identification.  Include the name, yes, along with  
> nameAuthor, date, accordingTo elements, but not the NameID.  They  
> are all just strings to be processed in determining what taxonomic  
> concept was intended. I think a TaxonNameID is useful only in  
> documenting the derivation of names (if it's useful at all).  That  
> could be an extreme view, but I tossed it out to see if others agreed  
> or not.
Agreed. To me a name ID is only good for linking to a richer definition
at a nomenclator, an IPNI LSID for example. But I'm not sure that adds
any value to a taxon definition or even an identification.

>> In case we have 2 ID terms, one for the name (scientificNameID) and
>> one for the concept (taxonConceptID), this maps nicely to TCS. The
>> reason why we favoured nameUsageID over taxonID was that a taxon by
>> definition is accepted.
> In the context of an identification, yes, a taxon is asserted to be  
> valid/accepted by the identifier (at the time), but not all  
> identifications are accepted by the data manager, so that last  
> statement isn't always true.  Also not all taxa are accepted/valid  
> within a classification (if it includes synonymous taxa).
> I think "nameUsageID" is too obscure for general use.  It makes  
> sense, but only after understanding the "name-according-to"  
> concept.  I think TaxonID along with (and therefore different from)  
> TaxonNameID make a better combination.  TaxonConceptID also works  
> well.
>> And it would therefore be awkward to use this
>> term as the ID for a synonym or misapplied name - unless it refers to
>> the accepted taxon and is not a "primary key". If we call it
>> taxonConceptID this might be less of a problem. TCS does the same and
>> uses the taxon concept ID as a primary key for classic synonyms aka
>> nominal taxon concepts.
> I don't follow the "primary key" point, but I think we agree about  
> the taxonConceptID as the preferred (or acceptable) name for this  
> term.
By primary key I meant an ID that is unique for every record in a
taxonomic dataset that also includes synonyms and other non-accepted
"taxa". Some people would argue that a synonym doesn't have a concept
and therefore only accepted taxa have a taxonID. If we agree that this
is not the case, then all is fine and I would actually prefer to stick
with taxonID instead of the unnecessarily longer taxonConceptID.
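
To illustrate the "primary key" reading, here is a minimal Python
sketch. The record layout, the sample values and the helper names
(check_primary_key, synonyms) are assumptions made up for illustration,
not part of any DwC draft. It treats taxonID as unique for every record,
synonyms included, and lets a synonym point to its accepted taxon via
acceptedTaxonID.

```python
from collections import Counter

# Hypothetical records: every row, accepted or not, carries its own taxonID.
records = [
    {"taxonID": "t1", "scientificName": "Aus bus", "acceptedTaxonID": "t1"},
    {"taxonID": "t2", "scientificName": "Aus cus", "acceptedTaxonID": "t1"},  # synonym of t1
    {"taxonID": "t3", "scientificName": "Xus yus", "acceptedTaxonID": "t3"},
]


def check_primary_key(rows):
    """Return the taxonID values that occur more than once (empty if unique)."""
    counts = Counter(row["taxonID"] for row in rows)
    return sorted(tid for tid, n in counts.items() if n > 1)


def synonyms(rows):
    """Return the names of records whose acceptedTaxonID points elsewhere."""
    return [r["scientificName"] for r in rows if r["acceptedTaxonID"] != r["taxonID"]]
```

Under the other reading, where only accepted taxa get a taxonID, the
synonym row would have no key of its own and could not be addressed
individually.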

>> My biggest concern with going for actually two classes (name & taxon
>> concept) and having 2 IDs for them is what do other ID terms like
>> originalNameID, higherTaxonID, acceptedTaxonID refer to? It might
>> cause confusion in case the IDs are not GUIDs - which I think we will
>> have to still live with for quite some time. On the other hand this
>> might be the price to pay for being exact and the terms quite clearly
>> are called either xxxNameID or xxxTaxonID...
> If we are moving from distributed query to harvesting and querying  
> against a large integrated repository, the need for most of the  
> higher taxonomic names goes away.  We put them in the original DwC  
> to support querying.  In a large integrated repository, the higher  
> classification used by the provider will be stripped off, won't it?   
> All that will matter is the lowest level (most precise) taxon used  
> in the identification(s).
The higher classification is very useful for both occurrence
identifications and taxonomic datasets. In both cases it helps to
tell canonical homonyms apart.
And for taxonomic datasets the classification is a central part of
the information. Explicit terms for higher ranks like dwc:kingdom,
dwc:higherTaxon for the verbatim name string of a higher taxon at any
rank, and dwc:higherTaxonID for transmitting the hierarchy through
unique IDs are just several options for the same thing. This helps
the publishers, while the consumers have to deal with more complexity,
but to a tolerable degree I would think. In my applications so far I
tend to implement the precedence higherTaxonID > higherTaxon > KPCOFGS
in case several of those terms are present.
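
As a rough illustration of such a precedence rule, here is a minimal
Python sketch. It assumes each record is a plain dict keyed by term
name; the function name resolve_parent, the RANKS list and the return
shape are hypothetical. The rank list stops at genus, which is also an
assumption; extend it if your data carries a species column.

```python
# Rank columns checked last, in KPCOFG order (assumed key names).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]


def resolve_parent(record):
    """Pick the higher-classification source by precedence.

    higherTaxonID wins over higherTaxon, which wins over the
    individual rank columns. Returns a (source, value) tuple.
    """
    if record.get("higherTaxonID"):
        return ("higherTaxonID", record["higherTaxonID"])
    if record.get("higherTaxon"):
        return ("higherTaxon", record["higherTaxon"])
    # Fall back to whatever rank columns are filled in.
    ranked = [(rank, record[rank]) for rank in RANKS if record.get(rank)]
    if ranked:
        return ("ranks", ranked)
    return (None, None)
```

A consumer would call resolve_parent once per record and branch on the
returned source, so the extra complexity stays in one place.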

> Are we still trying to support the distributed query function?
> -Stan
