[tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?
Richard Pyle
deepreef at bishopmuseum.org
Wed Nov 24 01:15:29 CET 2010
> Rich,
> I gather your reason would be because it's unclear if anyone
> would actually use a canonicalName element? That is, it's
> unneeded.
Not exactly. I see a need, but I see other needs as well. I'd rather find
a more stable and robust solution that allows people with both parsed and
unparsed data to share what they have in a way that allows users to get what
they need & want. My problem is not so much with canonicalName per se, but
rather that it is unclear where the overlap and non-overlap with
scientificName is. It seems to me to be 80% overlap. As I said in my last
post to Tony just now, I think the real problem is that scientificName is
both core and loosely defined.
My feeling is that there should be *one* way to deal with completely
unparsed data (e.g., a term that starts with "verbatim"), and *one* way for
people to deal with parsed data. The problem is, there are different levels
of parsing out there:
- fully unparsed with name bits and/or authorship in the same text string
- authorship and name bits parsed
- name bits parsed into "NameElement" pieces, with qualifiers stripped
- name bits parsed into "NameElement" pieces, with qualifiers included
- authorship parsed to names and/or years
- authorship parsed to reference citation
- etc.
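
To make the levels listed above a little more concrete, here is a rough Python sketch of one record type that can hold a name at several of those levels at once, plus a helper that reports which levels a given record actually reaches. Everything in it is an assumption made for illustration: `NameRecord`, its field names, and `parse_levels` are hypothetical and are not Darwin Core terms, and "Aus bus Smith, 1900" is a placeholder name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NameRecord:
    """Hypothetical container; field names are illustrative, not Darwin Core terms."""
    verbatim: str                        # the string exactly as supplied
    name_part: Optional[str] = None      # name bits with authorship removed
    authorship: Optional[str] = None     # authorship as one unparsed string
    name_elements: List[str] = field(default_factory=list)  # e.g. ["Aus", "bus"]
    qualifiers: List[str] = field(default_factory=list)     # e.g. ["cf."]
    author_names: List[str] = field(default_factory=list)   # e.g. ["Smith"]
    year: Optional[int] = None


def parse_levels(rec: NameRecord) -> List[str]:
    """Report which parse levels this record provides (labels are informal)."""
    levels = []
    if rec.name_part is not None and rec.authorship is not None:
        levels.append("name and authorship split")
    if rec.name_elements:
        levels.append("name elements")
    if rec.qualifiers:
        levels.append("qualifiers kept")
    if rec.author_names or rec.year is not None:
        levels.append("authorship as names/year")
    # Nothing beyond the verbatim string was supplied.
    return levels or ["fully unparsed"]


unparsed = NameRecord(verbatim="Aus bus Smith, 1900")
split = NameRecord(
    verbatim="Aus bus Smith, 1900",
    name_part="Aus bus",
    authorship="Smith, 1900",
)
detailed = NameRecord(
    verbatim="Aus bus Smith, 1900",
    name_part="Aus bus",
    authorship="Smith, 1900",
    name_elements=["Aus", "bus"],
    author_names=["Smith"],
    year=1900,
)

for rec in (unparsed, split, detailed):
    print(parse_levels(rec))
```

The point of the sketch is only that the verbatim string is always present, while each further level is optional and can be detected by a consumer, rather than being folded into one loosely defined field.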
> You said you worry about feature creep. I suppose I worry
> about semantic creep. Extending the meaning of a term makes
> it more universal, but in a data world it increases the
> variability of the data that may be found attached to the
> term in some dataset. Imprecision in terms can create a lot
> of data quality headaches. Is that acceptable?
Exactly! This is getting at my point about scientificName being very
liberal, but very essential (and overloaded). The addition of canonicalName
would deal with one or two problems, but I don't think it would solve the
fundamental problem(s). It's more of a band-aid than a cure, I guess.
Rich