[tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad?
Richard Pyle
deepreef at bishopmuseum.org
Tue Nov 23 22:52:38 CET 2010
> What is the specific objection to adding canonicalName to DwC
> as an optional element, other than the fact it makes DwC one
> thing larger?
I don't have an objection to it per se, but I'd like to feel more certain
that I understand exactly what it is, and what it is intended to achieve,
that is not already achievable with existing terms and/or couldn't be more
achievable with an alternative solution. I think there is value in avoiding
feature-creep with DwC, except when a new term solves a real problem with the
existing terms. I agree there is a problem here, but I'm still struggling
to understand exactly what specific problem that something like
canonicalName will solve.
> There are databases which do not have their names parsed and
> provide whatever they have recorded as ScientificName. But,
> there are also databases which do have parsed names and could
> provide this more narrowly defined element, in addition to
> the ScientificName. Those databases could make use of a
> dwc:canonicalName element in their data exchange or query response.
Right -- but the point is this: if the data are already parsed, where is the
failure of the existing DwC terms in providing the desired service? We've
already identified one of those: i.e., that "intermediate" uninomial ranks
not supported by existing DwC terms don't have a place to put the canonical
form of the name (other than scientificName, which isn't currently intended
or required to be canonical). So yes, that's a clear problem in need of a
solution. But is a generic canonicalName term really going to solve that
efficiently/effectively? What other problems might canonicalName solve?
> What we don't have and I think never will have is perfectly
> consistent names data from every database in the world. One
> reason is a mountain of inconsistently recorded legacy data
> from decades past that stands in the way of perfection.
> Another is variation in convention or tradition for a variety
> of reasons that have been explored in these recent threads.
> So, I think the pragmatic approach is to accept the
> inconsistencies and work around them.
Agreed! And my questions are:
1) What specific problems with existing DwC do we wish to solve?
2) How best to solve them?
I'll list two examples for #1:
A) Representing the canonical (sans-authorship) form of a uninomial name at
a rank not already represented by existing rank-specific DwC terms (kingdom,
phylum, class, order, family, genus)
Because the current definition of dwc:scientificName allows (optionally) the
inclusion of authorship information, there is no clean way to represent a
uninomial name in a way that expressly excludes authorship -- except if the
uninomial name happens to be represented at the rank of kingdom, phylum,
class, order, family, or genus.
B) Content providers who have authorship data in a separate field from taxon
name data, but who have not parsed the bits of a taxon name string
In this case, the provider cannot provide the parsed bits of the name, but
can provide a (sort of) canonicalName string separately from an authorship
string. If they concatenate the authorship string with the taxon name
string when populating dwc:scientificName, then the consumer has no easy way
of extracting the name bits from the authorship bits (unless the provider
also provides dwc:scientificNameAuthorship, which could be removed exactly
from the dwc:scientificName value, yielding what the provider would have
otherwise provided as canonicalName). Or, as David suggested, in this case
the authorship text would not be concatenated with scientificName.
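To make the parenthetical case in B concrete, here is a rough sketch in
Python of how a consumer might recover the name string when the provider
supplies both dwc:scientificName and dwc:scientificNameAuthorship. The
function name strip_authorship and the sample record are hypothetical, not
part of DwC, and the sketch assumes the authorship string appears verbatim
at the end of scientificName. It would not handle authorship embedded in
the middle of a name (e.g., botanical autonyms).

```python
def strip_authorship(scientific_name: str, authorship: str) -> str:
    """Return scientific_name with a trailing authorship string removed.

    Assumes the authorship, if present, is an exact suffix of the name.
    If it is not found at the end, the name is returned unchanged.
    """
    name = scientific_name.strip()
    author = authorship.strip()
    if author and name.endswith(author):
        return name[: -len(author)].rstrip()
    return name


# Hypothetical record using the two existing DwC terms.
record = {
    "scientificName": "Homo sapiens Linnaeus, 1758",
    "scientificNameAuthorship": "Linnaeus, 1758",
}

canonical = strip_authorship(
    record["scientificName"], record["scientificNameAuthorship"]
)
print(canonical)
```

If the provider did not concatenate authorship in the first place, the
function simply returns scientificName as given, so the consumer does not
need to know which convention the provider followed.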
I would like to know some other problems that could be solved with the
addition of a canonicalName term before I start commenting on #2.
Aloha,
Rich