[tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad?

Markus Döring m.doering at mac.com
Wed Nov 24 13:20:28 CET 2010


>> B) ...If they concatenate the authorship string with the taxon name
>> string when populating dwc:scientificName, then the consumer has no easy
>> way of extracting the name bits from the authorship bits
> 
> Exactly! Wearing my data consumer hat, the first thing I need to do with current dwc:scientificName content from multiple sources is try to generate canonical names by stripping off what appear to be authorities (hopefully successfully, but not guaranteed). If there was an extra field populated in all or even a subset of cases, this task would not be required.
> 
> So, I think the main driver for this has to come from the large-scale data consumers - GBIF, OBIS (with which I am associated), EOL, ALA etc. - if they would find such a field useful, that is the real test. In my other incarnation as a data supplier, I can concatenate everything into scientificName as per the present DwC spec, no problem; it is just a lossy export by the time it is received, as far as I am concerned.
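
For readers who have not had to do it, the stripping step described above might look roughly like the sketch below. It is only an illustration of the heuristic, not what GBIF or any other consumer actually runs: the function name, the rank-marker list and the "lowercase tokens are epithets" rule are all assumptions made for the example.

```python
import re

# Assumed, incomplete list of infraspecific rank markers.
RANK_MARKERS = {"subsp.", "ssp.", "var.", "f.", "subvar."}


def guess_canonical(scientific_name: str) -> str:
    """Heuristically drop what appears to be authorship from a name string.

    Assumption: the first token is the genus or uninomial, and every
    following all-lowercase token (or rank marker) is still part of the name.
    The first token that does not fit is taken as the start of the authorship.
    """
    tokens = scientific_name.split()
    if not tokens:
        return ""
    kept = [tokens[0]]
    for token in tokens[1:]:
        if token in RANK_MARKERS or re.fullmatch(r"[a-z][a-z-]+", token):
            kept.append(token)
        else:
            break
    # A dangling rank marker with no epithet after it is not part of the name.
    while kept and kept[-1] in RANK_MARKERS:
        kept.pop()
    return " ".join(kept)
```

With this sketch "Puma concolor (Linnaeus, 1771)" comes back as "Puma concolor", but a made-up string such as "Aus bus van der Berg" comes back as "Aus bus van der", because the lowercase author particles look like epithets. That is the "not guaranteed" part.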

From GBIF's point of view there is no problem at all with using the full scientific name as it is.
In fact my preferred solution would be to only have to look into scientificName and nowhere else! Fewer options are better.

Also, nearly all datasets have a mix of canonical and "qualified" scientific names, so I am sure providers will find it hard to populate canonicalName only with canonicals and scientificName only with names that include authorship. I bet that in the end we would still have to check for all options, dealing with canonicals in scientificName and potentially with inconsistencies between canonicalName + authorship and scientificName. It would also be harder to define a single required term. If I supply the canonicalName already, do I still have to populate the scientificName? Even if I only have the canonical? If I have an unparsed full name, how will I be able to fill in the canonical? From my point of view it's not getting any easier.
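
To make "check for all options" concrete, here is a minimal sketch of the cases a consumer would have to tell apart if the extra field existed. Note that canonicalName is only the term proposed in this thread, not an existing DwC term, and the function name and status labels are invented for the example.

```python
from typing import Optional


def _norm(value: Optional[str]) -> str:
    """Collapse runs of whitespace; treat a missing field as empty."""
    return " ".join((value or "").split())


def check_name_fields(scientific_name: Optional[str],
                      canonical_name: Optional[str] = None,
                      authorship: Optional[str] = None) -> str:
    """Classify how the (hypothetical) name fields of one record relate."""
    full = _norm(scientific_name)
    canonical = _norm(canonical_name)
    author = _norm(authorship)

    if not full and not canonical:
        return "empty"
    if not canonical:
        return "full_only"          # may or may not contain authorship
    if not full:
        return "canonical_only"
    if full == canonical:
        return "canonical_in_full"  # scientificName holds a bare canonical
    if author and full == f"{canonical} {author}":
        return "consistent"
    if full.startswith(canonical + " "):
        return "partial"            # extra text that is not the given authorship
    return "inconsistent"
```

Seven outcomes for three fields, and only one of them ("consistent") is the situation the extra term is meant to create.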

Markus




>> -----Original Message-----
>> From: Richard Pyle [mailto:deepreef at bishopmuseum.org]
>> Sent: Wednesday, 24 November 2010 8:53 AM
>> To: 'Chuck Miller'; Rees, Tony (CMAR, Hobart); dremsen at gbif.org
>> Cc: tdwg-content at lists.tdwg.org; dmozzherin at eol.org
>> Subject: RE: [tdwg-content] [tdwg-tag] Inclusion of authorship in
>> DwCscientificName: good or bad?
>> 
>> 
>>> What is the specific objection to adding canonicalName to DwC
>>> as an optional element, other than the fact it makes DwC one
>>> thing larger?
>> 
>> I don't have an objection to it per se, but I'd like to feel more certain
>> that I understand exactly what it is, and what it is intended to achieve,
>> that is not already achievable with existing terms and/or couldn't be more
>> achievable with an alternative solution. I think there is value in
>> avoiding feature-creep with DwC, except when we can solve a real problem
>> with the existing terms. I agree there is a problem there, but I'm still
>> struggling to understand exactly what specific problem that something like
>> canonicalName will solve.
>> 
>>> There are databases which do not have their names parsed and
>>> provide whatever they have recorded as ScientificName.  But,
>>> there are also databases which do have parsed names and could
>>> provide this more narrowly defined element, in addition to
>>> the ScientificName.  Those databases could make use of a
>>> dwc:canonicalName element in their data exchange or query response.
>> 
>> Right -- but the point is this: if the data are already parsed, where is
>> the failure of the existing DwC terms in providing the desired service?
>> We've already identified one of those: i.e., that "intermediate" uninomial
>> ranks not supported by existing DwC terms don't have a place to put the
>> canonical form of the name (other than scientificName, which isn't
>> currently intended or required to be canonical). So yes, that's a clear
>> problem in need of a solution. But is a generic canonicalName term really
>> going to solve that efficiently/effectively? What other problems might
>> canonicalName solve?
>> 
>>> What we don't have and I think never will have is perfectly
>>> consistent names data from every database in the world.  One
>>> reason is a mountain of inconsistently recorded legacy data
>>> from decades past that stands in the way of perfection.
>>> Another is variation in convention or tradition for a variety
>>> of reasons that have been explored in these recent threads.
>>> So, I think the pragmatic approach is to accept the
>>> inconsistencies and work around them.
>> 
>> Agreed!  And my questions are:
>> 
>> 1) What specific problems with existing DwC do we wish to solve?
>> 2) How best to solve them?
>> 
>> I'll list two examples for #1:
>> 
>> A) Representing the canonical (sans-authorship) form of a uninomial name
>> at a rank not already represented by existing rank-specific DwC terms
>> (kingdom, phylum, class, order, family, genus)
>> Because the current definition of dwc:scientificName allows (optionally)
>> the inclusion of authorship information, there is no clean way to
>> represent a uninomial name in a way that expressly excludes authorship --
>> except if the uninomial name happens to be represented at the rank of
>> kingdom, phylum, class, order, family, or genus.
>> 
>> B) Content providers who have authorship data in a separate field from
>> taxon name data, but who have not parsed the bits of a taxon name string
>> In this case, the provider cannot provide the parsed bits of the name, but
>> can provide a (sort of) canonicalName string separately from an authorship
>> string.  If they concatenate the authorship string with the taxon name
>> string when populating dwc:scientificName, then the consumer has no easy
>> way of extracting the name bits from the authorship bits (unless the
>> provider also provides dwc:scientificNameAuthorship, which could be
>> removed exactly from the dwc:scientificName value, yielding what the
>> provider would otherwise have provided as canonicalName). Or, as David
>> suggested, in this case the Authorship text would not be concatenated
>> with scientificName.
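
The "exact removal" in case B can be sketched as below, assuming the provider built scientificName by appending the authorship string to the name with a space. The function name is made up for the example, and a return value of None simply means that assumption did not hold for the record.

```python
from typing import Optional


def strip_authorship(scientific_name: str, authorship: str) -> Optional[str]:
    """Remove an exactly matching trailing authorship string from a name.

    Returns the remaining name, or None when the authorship is not a plain
    suffix of the name (so nothing can be removed safely).
    """
    # Normalise whitespace on both sides so spacing differences do not matter.
    name = " ".join(scientific_name.split())
    author = " ".join(authorship.split())
    if not author:
        return name  # nothing to remove
    if name.endswith(" " + author):
        return name[: -len(author)].rstrip()
    return None
```

Given "Puma concolor (Linnaeus, 1771)" and "(Linnaeus, 1771)" this yields "Puma concolor". It yields None when scientificName was supplied without authorship, or when the authorship does not sit at the very end of the string, so the consumer still needs a fallback for those records.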
>> 
>> I would like to know some other problems that could be solved with the
>> addition of a canonicalName term before I start commenting on #2.
>> 
>> Aloha,
>> Rich
>> 
> 