Rich, I gather your reason would be because it's unclear if anyone would actually use a canonicalName element? That is, it's unneeded.
Not exactly. I see a need, but I see other needs as well. I'd rather find a more stable and robust solution that allows people with both parsed and unparsed data to share what they have in a way that allows users to get what they need & want. My problem is not so much with canonicalName per se, but rather that it is unclear where it overlaps with scientificName and where it does not. It seems to me to be 80% overlap. As I said in my last post to Tony just now, I think the real problem is that scientificName is both core and loosely defined.
My feeling is that there should be *one* way to deal with completely unparsed data (e.g., a term that starts with "verbatim"), and *one* way for people to deal with parsed data. The problem is, there are different levels of parsing out there:
- fully unparsed with name bits and/or authorship in the same text string
- authorship and name bits parsed
- name bits parsed into "NameElement" pieces, with qualifiers stripped
- name bits parsed into "NameElement" pieces, with qualifiers included
- authorship parsed to names and/or years
- authorship parsed to reference citation
- etc.
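
To make a few of these levels concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the placeholder name "Aus bus Smith, 1900", the function names, and the naive splitting rules are hypothetical, and none of them are Darwin Core terms or a real name parser. It only handles simple "Genus epithet Author, Year" strings.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical, fully unparsed record: name bits and authorship in one string.
VERBATIM = "Aus bus Smith, 1900"


def split_name_and_authorship(verbatim: str) -> Tuple[str, str]:
    """Level: authorship and name bits parsed.

    Naive assumption: the first two tokens are the name, the rest is authorship.
    """
    tokens = verbatim.split()
    return " ".join(tokens[:2]), " ".join(tokens[2:])


def split_name_elements(name_bits: str) -> List[str]:
    """Level: name bits parsed into "NameElement"-style pieces."""
    return name_bits.split()


def parse_authorship(authorship: str) -> Tuple[List[str], Optional[int]]:
    """Level: authorship parsed to names and/or years.

    Naive assumption: authorship looks like "Author, YYYY" or just "Author".
    """
    match = re.fullmatch(r"(.+?),\s*(\d{4})", authorship)
    if match is None:
        return ([authorship] if authorship else []), None
    return [match.group(1)], int(match.group(2))


name_bits, authorship = split_name_and_authorship(VERBATIM)
elements = split_name_elements(name_bits)
authors, year = parse_authorship(authorship)

print(name_bits)   # name without authorship
print(authorship)  # authorship string
print(elements)    # individual name elements
print(authors, year)
```

The point of the sketch is that the same record can legitimately be shared at any of these levels, so a single loosely defined term ends up holding quite different kinds of strings depending on who supplied the data.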
You said you worry about feature creep. I suppose I worry about semantic creep. Extending the meaning of a term makes it more universal, but in a data world it increases the variability of the data that may be found attached to the term in some dataset. Imprecision in terms can create a lot of data quality headaches. Is that acceptable?
Exactly! This is getting at my point about scientificName being very liberal, but very essential (and overloaded). The addition of canonicalName would deal with one or two problems, but I don't think it would solve the fundamental problem(s). It's more of a band-aid than a cure, I guess.
Rich