[tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?

Richard Pyle deepreef at bishopmuseum.org
Thu Nov 25 09:40:31 CET 2010


> This all sounds like it's getting terribly complicated and the combined
> discussion on atomised parts vs canonical/full-name are confusing me.

You're not alone! :-)

> For the first part,  I still think available parsing tools make 99.8% of all
> cases tractable but if we want to be explicit and not run these services all
> the time then.

I can go with that (basically what I've been advocating all along:  that is,
we need to identify where it's broke before we try to fix it).  At the
moment, many data consumers don't have those tools available locally, but I
think as more people become familiar with the tools that are out there, then
this will not be so much a problem.  In cases where scientificName is
concatenated from parsed bits, then I suspect the client-side parsing will
be closer to 100% effective.
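
As a rough illustration of that last point, here is a minimal Python sketch, assuming the provider also supplies scientificNameAuthorship alongside the concatenated scientificName. The function name and the sample record are hypothetical, made up for this example; this is not one of the existing parsing tools.

```python
def strip_known_authorship(record):
    """Return scientificName without its authorship suffix, when the
    record also carries that authorship as a separate parsed field."""
    full = (record.get("scientificName") or "").strip()
    author = (record.get("scientificNameAuthorship") or "").strip()
    # If the full name was concatenated from parsed bits, the authorship
    # string should appear verbatim at the end of it.
    if author and full.endswith(author):
        return full[: -len(author)].strip()
    # No separate authorship to go on: return the name unchanged.
    return full


# Hypothetical record, for illustration only.
record = {
    "scientificName": "Puma concolor (Linnaeus, 1771)",
    "scientificNameAuthorship": "(Linnaeus, 1771)",
}
print(strip_known_authorship(record))  # prints "Puma concolor"
```

When the separate authorship field is missing, this falls through and returns the name as given, which is exactly where a real name parser would still be needed.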

> For datasets that separate the name from the authorship we make sure it's
> clear this separation is retained.  The definition for this term
> must change.   It doesn't make sense to me to concatenate two elements
> that are already split.   The parts go into:
> 1. scientificName + scientificNameAuthorship
> 
> For datasets with only a scientificName.  The name goes into:
> 2. scientificName
> For datasets with scientificName and authorship in a single field we have two
> choices:
> 3a. scientificName  # in which case we must be able to detect and split
> authorship and we need to detect the canonical form in case 2
> 3b. scientificNameWithAuthorship # rather than a canonicalName term which is
> confusing we use a less ambiguous term like this.
> 
> It seems to me the intent of 3b is more explicit as to what we intend by
> adding canonicalName.

The thing that concerns me about 3a is the conditional definition of
scientificName (i.e., it excludes authorship when that information is
pre-parsed, but includes it when it's not pre-parsed).

The thing that concerns me about 3b is that scientificName is (supposedly?)
required; so if someone with unparsed information provides
scientificNameWithAuthorship, then are they supposed to leave scientificName
blank?
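
To make the 3b concern concrete, here is a small Python sketch of how a provider might map its data under cases 1, 2 and 3b. The function map_name and its parameters are hypothetical, and scientificNameWithAuthorship is only the term proposed in the quoted message, not an existing Darwin Core term.

```python
def map_name(name, authorship=None, combined=False):
    """Map a provider's name data onto the terms discussed above.

    name       -- the name string as the provider holds it
    authorship -- separately held authorship, if any (case 1)
    combined   -- True if `name` has authorship embedded in it (case 3b)
    """
    if authorship is not None:
        # Case 1: name and authorship are already split, so keep them split.
        return {"scientificName": name, "scientificNameAuthorship": authorship}
    if combined:
        # Case 3b: the unparsed string goes into the proposed term, which
        # leaves nothing to put into the (supposedly required) scientificName.
        return {"scientificName": "", "scientificNameWithAuthorship": name}
    # Case 2: only a bare name is available.
    return {"scientificName": name}


print(map_name("Puma concolor (Linnaeus, 1771)", combined=True))
```

Under these assumptions the 3b record comes out with an empty scientificName, which is the situation the question above is asking about.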

Rich



