[tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?

Richard Pyle deepreef at bishopmuseum.org
Wed Nov 24 01:15:29 CET 2010


> Rich,
> I gather your reason would be because it's unclear if anyone 
> would actually use a canonicalName element? That is, it's 
> unneeded. 

Not exactly.  I see a need, but I see other needs as well.  I'd rather find
a more stable and robust solution that allows people with both parsed and
unparsed data to share what they have in a way that allows users to get what
they need & want.  My problem is not so much with canonicalName per se, but
rather that it is unclear where the overlap and non-overlap with
scientificName is. It seems to me to be 80% overlap.  As I said in my last
post to Tony just now, I think the real problem is that scientificName is
both core and loosely defined.

My feeling is that there should be *one* way to deal with completely
unparsed data (e.g., a term that starts with "verbatim"), and *one* way for
people to deal with parsed data.  The problem is, there are different levels
of parsing out there:

- fully unparsed with name bits and/or authorship in the same text string
- authorship and name bits parsed
- name bits parsed into "NameElement" pieces, with qualifiers stripped
- name bits parsed into "NameElement" pieces, with qualifiers included
- authorship parsed to names and/or years
- authorship parsed to reference citation
- etc.
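
As a rough illustration only, the levels listed above could be modeled as a
single record in which each parsed piece is optional and the verbatim string
is always present.  This is a minimal sketch, not a proposal for actual DwC
terms: every field name below (verbatim, name_elements, qualifiers,
authorship, author_names, author_year) is hypothetical, and "Aus bus" is a
placeholder name rather than real data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NameRecord:
    """One shared name at whatever level of parsing the provider has.

    All field names are hypothetical and chosen for illustration only.
    """
    verbatim: str                              # the string exactly as supplied
    name_elements: Optional[List[str]] = None  # name bits, e.g. ["Aus", "bus"]
    qualifiers: Optional[List[str]] = None     # qualifiers kept separately, if any
    authorship: Optional[str] = None           # authorship as one string
    author_names: Optional[List[str]] = None   # authorship parsed to names
    author_year: Optional[int] = None          # authorship parsed to a year


def parsed_parts(rec: NameRecord) -> List[str]:
    """List which pieces of the record have been parsed out."""
    found = []
    if rec.name_elements is not None:
        found.append("name elements")
    if rec.qualifiers is not None:
        found.append("qualifiers")
    if rec.authorship is not None:
        found.append("authorship string")
    if rec.author_names is not None or rec.author_year is not None:
        found.append("authorship detail")
    # A record with only the verbatim string is reported as unparsed.
    return found or ["unparsed"]


unparsed = NameRecord(verbatim="Aus bus Smith, 1900")
partly = NameRecord(
    verbatim="Aus bus Smith, 1900",
    name_elements=["Aus", "bus"],
    authorship="Smith, 1900",
)
print(parsed_parts(unparsed))
print(parsed_parts(partly))
```

The point of the sketch is that a consumer can ask what a provider actually
has instead of guessing what a single overloaded field contains.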

> You said you worry about feature creep. I suppose I worry 
> about semantic creep. Extending the meaning of a term makes 
> it more universal, but in a data world it increases the 
> variability of the data that may be found attached to the 
> term in some dataset. Imprecision in terms can create a lot 
> of data quality headaches. Is that acceptable?

Exactly!  This is getting at my point about scientificName being very
liberal, but very essential (and overloaded).  The addition of canonicalName
would deal with one or two problems, but I don't think it would solve the
fundamental problem(s).  It's more of a band-aid than a cure, I guess.

Rich



