Stan, thanks for the very clear summary. I think you are pretty much right, though I'm not sure I understand you correctly when you say we should differentiate between describing taxa and using them for identifications. Are you suggesting a completely separate set of terms for the two purposes?
If we have 2 ID terms, one for the name (scientificNameID) and one for the concept (taxonConceptID), this maps nicely to TCS. The reason we favoured nameUsageID over taxonID was that a taxon is by definition accepted, so it would be awkward to use this term as the ID for a synonym or misapplied name, unless it refers to the accepted taxon and is not a "primary key". If we call it taxonConceptID this might be less of a problem. TCS does the same and uses the taxon concept ID as a primary key for classic synonyms, aka nominal taxon concepts.
Regarding GUIDs: yes, I would hope that all IDs are GUIDs, but I wouldn't mandate that, just recommend it. The only requirement should be that they are consistent within a dataset.
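To make "consistent within a dataset" a bit more concrete, here is a minimal Python sketch of one possible check: every ID that a record points to should also exist as a record ID in the same dataset. The record layout (plain dicts), the function name find_dangling_ids, and the sample values are all assumptions for illustration, and the term names are the drafts discussed in this thread, not a settled vocabulary.

```python
# Hypothetical check: every ID a record points to must exist as a taxonID
# in the same dataset. IDs may be local strings (e.g. "t1") or GUIDs.
REFERENCE_TERMS = ("acceptedTaxonID", "higherTaxonID", "originalTaxonID")


def find_dangling_ids(records):
    """Return (taxonID, term, value) triples whose value matches no taxonID."""
    known = {r["taxonID"] for r in records}
    dangling = []
    for r in records:
        for term in REFERENCE_TERMS:
            value = r.get(term)
            if value and value not in known:
                dangling.append((r["taxonID"], term, value))
    return dangling


# Invented sample data; "t9" is deliberately missing from the dataset.
records = [
    {"taxonID": "t1", "scientificName": "Puma"},
    {"taxonID": "t2", "scientificName": "Puma concolor", "higherTaxonID": "t1"},
    {"taxonID": "t3", "scientificName": "Felis concolor", "acceptedTaxonID": "t2"},
    {"taxonID": "t4", "scientificName": "Puma yagouaroundi", "higherTaxonID": "t9"},
]

print(find_dangling_ids(records))
```

The same check works whether the IDs are GUIDs or dataset-local keys, which is the point of only recommending GUIDs.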
My biggest concern with actually going for two classes (name & taxon concept) and having 2 IDs for them is: what do other ID terms like originalNameID, higherTaxonID, acceptedTaxonID refer to? It might cause confusion in case the IDs are not GUIDs, which I think we will still have to live with for quite some time. On the other hand, this might be the price to pay for being exact, and the terms quite clearly are called either xxxNameID or xxxTaxonID...
Markus
On Aug 25, 2009, at 10:45, Blum, Stan wrote:
If we're striving for compatibility with previous work wherever possible, the TNC schema should be the appropriate "higher" authority when it comes to names and concepts. Departure from that should come only if we're aware of a problem. (We would be justified in abandoning previous DwC terms in order to promote consistency with TNC.)
If the new DwC is supposed to transport purely taxonomic data, and not just identifications of specimens and observations, then I think DwC needs to include at least the most important elements in TNC. (BTW, in the spirit of acknowledging prior art, my earliest knowledge of this distinction, in conjunction with explicit information modeling methods, goes back to 1993 [Beach, Pramanik, and Beaman 1993; as well as Croft and Whitbread pers. comm.])
I understood that TNC distinguishes taxonomic concepts (taxa) from taxonomic names as the biological entities versus the names that refer to them. If a taxonomic name is admitted to the domain of taxa (taxonomic concepts), it's the case referred to as the nominal concept, which I understood to be the union of taxa ever having borne that name.
Taxonomic identifications -- the relationship between an existing concept (taxon) and an organism or a particular group of organisms -- should refer to a taxonomic concept, even if the name used does not include enough information to reference a specific concept. This is in contrast to the application of a name (existing or new) to a taxon that is intended to be (published as) a taxonomic or nomenclatural act. The designation of a type specimen, for example, is not an identification. Perhaps naively, I think it's worth distinguishing identifications and the representation of taxonomic acts.
So in the context of transporting purely taxonomic data we have the need for: 1) a taxonomic name, 2) the concept it designates, 3) the ID of that concept, and 4) the ID of the name itself, if we accept that it has properties other than the string and the string itself does not uniquely identify the "name" in the context of taxonomic nomenclature (i.e., it can be modified for gender agreement and still be the same thing). When dealing with identifications, however, taxonomic names are just supposed to be part of the shorthand for referencing a taxonomic concept; so a taxonID (=taxonConceptID) is needed, but not a taxonNameID, except in the case we want to reveal that a specimen has been designated the type of a particular name and the name can be referenced with an actionable GUID. But again, I don't think of this latter case as an identification.
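One way to picture the four elements and the two kinds of reference is the following Python sketch. It is only an illustration under assumptions: the class names, field names, and sample IDs are hypothetical and are not taken from TNC/TCS or from any DwC draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonName:
    name_id: str      # 4) ID of the name itself
    name_string: str  # 1) the taxonomic name


@dataclass(frozen=True)
class TaxonConcept:
    concept_id: str    # 3) ID of the concept
    name_id: str       # the name that designates this concept (2)
    according_to: str  # hypothetical field: source that circumscribes it


@dataclass(frozen=True)
class Identification:
    specimen_id: str
    concept_id: str  # an identification points at a concept, not a name


@dataclass(frozen=True)
class TypeDesignation:
    specimen_id: str
    name_id: str  # a type designation points at a name, not a concept


# Invented sample values, for illustration only.
name = TaxonName(name_id="n1", name_string="Puma concolor")
concept = TaxonConcept(concept_id="c1", name_id=name.name_id,
                       according_to="some treatment")
ident = Identification(specimen_id="s1", concept_id=concept.concept_id)
type_des = TypeDesignation(specimen_id="s2", name_id=name.name_id)

print(ident.concept_id, type_des.name_id)
```

Keeping Identification and TypeDesignation as separate types mirrors the point that designating a type is not an identification.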
Do I have the concepts and distinctions correct?
Am I right in assuming that all of these various IDs are supposed to be actionable GUIDs?
-Stan
From: tdwg-content-bounces@lists.tdwg.org [tdwg-content-bounces@lists.tdwg.org] On Behalf Of John R. WIECZOREK [tuco@berkeley.edu]
Sent: Monday, August 24, 2009 10:00 PM
To: Markus Döring (GBIF)
Cc: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] DwC taxonomic terms
Right, that all makes sense now, and is exactly the kind of simplification that was already in place in the Location class, where the locationID refers to the Location as a whole, not some part of it, such as a country in one case or a city in another case. So, I agree, remove the taxonConceptID.
I've been struggling with trying to come up with a better term name than nameUsage. After reading the arguments again with every alternative I can come up with (scientificName, taxonName, taxon_name, nameAsUsed, nameAsPublished, publishedName, publishedTaxon) I'm not sure I can really do any better for a name that states specifically what you are trying to encompass with that term. Nevertheless, the term seems awkward, especially on first encounter. The terms would have to be very carefully described (but I guess all terms should be). The problem is, I think the same problem with recognizing what the term is for would happen on the second encounter as well ("What was that term for again?"). I don't think that would happen with terms that were more familiar, even if their meaning is broad. To me, "taxon" works, because it could be a name or a concept - exactly what we're trying to encompass.
So here's what I'd do in an attempt to be clear, concise, and consistent.
Given that the Class is Taxon (which captures the idea of a name as well as it does a concept), consistency would argue that the id term for a record of the class should be taxonID. The list of terms under this scenario would be: taxonID, acceptedTaxonID, higherTaxonID, originalTaxonID, scientificName, acceptedTaxon, higherTaxon, originalTaxon, higherClassification, kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, verbatimTaxonRank, scientificNameAuthorship, nomenclaturalCode, taxonPublicationID, taxonPublication, taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName.
I retained "scientificName" for two big reasons. First, the obvious alternative "taxon" would be too easily confused with the name of the Class "Taxon". Second, scientificName has broad current usage and will immediately suggest the appropriate content for most users. An additional minor reason is that the term contrasts with and is nicely consistent with "vernacularName".
The rest is all dependent on good definitions. Here are some drafts for new definitions for terms that need them. Please suggest any necessary revisions.
taxonID: An identifier for a specific taxon-related name usage (a Taxon record). May be a global unique identifier or an identifier specific to the data set.
acceptedTaxonID: A unique identifier for the acceptedTaxon.
higherTaxonID: A unique identifier for the taxon that is the parent of the scientificName.
originalTaxonID: A unique identifier for the basionym (botany), basonym (bacteriology), or replacement of the scientificName.
scientificName: The taxon name (with date and authorship information if applicable). When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain Identification qualifications, which should instead be supplied in the IdentificationQualifier term.
acceptedTaxon: The currently valid (zoological) or accepted (botanical) name for the scientificName.
higherTaxon: The taxon that is the parent of the scientificName.
originalTaxon: The basionym (botany), basonym (bacteriology), or replacement of the scientificName.
higherClassification: A list (concatenated and separated) of the names for the taxonomic ranks less specific than that given in the scientificName.
kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet - all unchanged.
taxonRank: The taxonomic rank of the scientificName. Recommended best practice is to use a controlled vocabulary.
verbatimTaxonRank: The verbatim original taxonomic rank of the scientificName.
scientificNameAuthorship, nomenclaturalCode - unchanged
taxonPublicationID: A unique identifier for the publication of the Taxon.
taxonPublication: A reference for the publication of the Taxon.
taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName - unchanged.
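As a rough illustration of how the draft terms above might fit together, here is a small Python sketch. It assumes Taxon records are plain dicts keyed by the draft term names, that a synonym carries an acceptedTaxonID pointing at the accepted record, and that higherClassification is joined with ";". The helper names, the separator, and the sample values are all assumptions for illustration; the draft definitions do not prescribe them.

```python
def accepted_record(records_by_id, taxon_id):
    """Follow acceptedTaxonID once; a record without one is returned as-is."""
    rec = records_by_id[taxon_id]
    accepted_id = rec.get("acceptedTaxonID")
    return records_by_id[accepted_id] if accepted_id else rec


def higher_classification(records_by_id, taxon_id, separator=";"):
    """Walk higherTaxonID links upward and join the parent names, top first."""
    names = []
    seen = {taxon_id}  # guards against accidental cycles in the data
    parent_id = records_by_id[taxon_id].get("higherTaxonID")
    while parent_id and parent_id not in seen:
        seen.add(parent_id)
        parent = records_by_id[parent_id]
        names.append(parent["scientificName"])
        parent_id = parent.get("higherTaxonID")
    return separator.join(reversed(names))


# Invented sample records using dataset-local IDs.
records = [
    {"taxonID": "t1", "scientificName": "Felidae", "taxonRank": "family"},
    {"taxonID": "t2", "scientificName": "Puma", "taxonRank": "genus",
     "higherTaxonID": "t1"},
    {"taxonID": "t3", "scientificName": "Puma concolor", "taxonRank": "species",
     "higherTaxonID": "t2"},
    {"taxonID": "t4", "scientificName": "Felis concolor", "taxonRank": "species",
     "acceptedTaxonID": "t3"},
]
by_id = {r["taxonID"]: r for r in records}

print(accepted_record(by_id, "t4")["scientificName"])
print(higher_classification(by_id, "t3"))
```

Here taxonID acts as the key for every record, whether it is an accepted name or a synonym, and the other ID terms only make sense relative to that key.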