[tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad? [SEC=UNCLASSIFIED]
Markus Döring
m.doering at mac.com
Thu Nov 25 09:11:08 CET 2010
The denormalised single Linnean rank terms are very, very helpful for sharing occurrence data.
They are the primary means to distinguish between homonyms when only a canonical name is given.
And they are found in many denormalised sources like spreadsheets. No doubt these are needed!
And yes, dwc:genus and dwc:subgenus are, according to their definitions, for the *classification*, not the parsed name (even though the two are mostly the same).
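To illustrate the homonym point, here is a minimal sketch. The two records are illustrative only (Oenanthe is a genus name in both birds and plants), and resolve() is a hypothetical helper, not part of any Darwin Core tooling.

```python
# Illustrative records only; the keys follow Darwin Core term names.
records = [
    {"scientificName": "Oenanthe", "kingdom": "Animalia", "family": "Muscicapidae"},
    {"scientificName": "Oenanthe", "kingdom": "Plantae", "family": "Apiaceae"},
]


def resolve(name, kingdom, candidates):
    """Hypothetical helper: keep candidates matching both name and kingdom."""
    return [
        r for r in candidates
        if r["scientificName"] == name and r["kingdom"] == kingdom
    ]


# The canonical name alone is ambiguous: both records match.
by_name_only = [r for r in records if r["scientificName"] == "Oenanthe"]

# Adding a denormalised rank column picks out a single record.
by_name_and_kingdom = resolve("Oenanthe", "Plantae", records)
```

With only the canonical name there is no way to choose between the two records; a single higher rank column is enough to separate them.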
As far as I can tell the dwc changes we are discussing are still the same. Either:
A) add a canonicalName term
or
B) add an atomised term for genus/uninomial + infrageneric/uninomial
I think both options are a way to go.
A single canonical name, if given correctly, is very straightforward to parse, so personally I think this is easier than having multiple terms.
For the name part terms I think I would agree with Chuck that a single uninomial can be used for genus or infrageneric ranks.
As a canonical binomial would *not* include a subgenus or section, there is no need to have that parsed information as a term.
In case the scientificName actually IS the subgenus, the uninomial can be used.
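As a rough sketch of how simple the parsing is, assuming the canonical name carries no authorship, no rank markers and no bracketed subgenus: the function parse_canonical() and the "uninomial" key are hypothetical (the latter stands in for the term proposed in option B), while specificEpithet and infraspecificEpithet are existing dwc terms.

```python
def parse_canonical(name):
    """Split a canonical name into its name parts.

    Assumes whitespace-separated parts with no authorship, rank markers
    or bracketed subgenus. The first part goes into a single hypothetical
    'uninomial' field, whatever the rank of the name.
    """
    parts = name.split()
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"not a canonical name: {name!r}")
    # Pad with None so that missing epithets are explicit.
    uninomial, specific, infraspecific = parts + [None] * (3 - len(parts))
    return {
        "uninomial": uninomial,
        "specificEpithet": specific,
        "infraspecificEpithet": infraspecific,
    }


binomial = parse_canonical("Mytilus edulis")
genus_or_subgenus = parse_canonical("Mytilus")
```

A name at genus or infrageneric rank ends up with only the uninomial filled in, so no separate subgenus or section term is needed for the parsed name.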
Markus
On Nov 25, 2010, at 8:36, David Remsen (GBIF) wrote:
> Rich
>
> Your two statements below don't jibe well in this case. Putting
> random concatenations of higher taxa into dwc:higherClassification
> would make for a real mess. Having only the basic named Linnaean
> ranks does ignore all of the intermediate ranks but it supports
> conformity at least for those in a way that higherClassification
> cannot as you lose the associated rank term. It also supports what I
> think is a fairly substantial bloc of data that exists in a denormal
> form with only (or nearly only) the basic Linnaean ranks in named rank
> columns. Concatenating these into dwc:higherClassification would be
> lossy in this case.
>
> My real concern, however, would be in trying to subsequently line up
> multiple datasets where there were omissions in some higher ranks so
> that the concatenations were abbreviated. In other words
>
> Bivalvia:Mytildae:Mytilus edulis
> Mollusca:Mytiloidea:Mytilus: Mytilus edulis
> Animalia: Mollusca:Mytiloidea:Mytildae:Mytilus: Mytilus edulis
>
> See http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
> for a real world example and, ignoring the other inherent messes,
> imagine trying to deal with this with no higher rank columns for
> context and all those nulls removed (no fair keeping the delimiters
> for them either).
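
A small sketch of the alignment problem described here. The three strings are copied as quoted above; split_path() is a hypothetical helper, and the assignment of ranks to names in the second half is my own reading of the example, not something stated in the data.

```python
# The three concatenations, copied as quoted above.
paths = [
    "Bivalvia:Mytildae:Mytilus edulis",
    "Mollusca:Mytiloidea:Mytilus: Mytilus edulis",
    "Animalia: Mollusca:Mytiloidea:Mytildae:Mytilus: Mytilus edulis",
]


def split_path(path):
    """Hypothetical helper: split on ':' and trim surrounding whitespace."""
    return [part.strip() for part in path.split(":")]


# Position alone carries no rank: the first element is a class in one
# record, a phylum in the next and a kingdom in the last.
first_elements = [split_path(p)[0] for p in paths]
depths = [len(split_path(p)) for p in paths]

# The same data in named rank columns (rank assignment assumed by me).
rows = [
    {"class": "Bivalvia", "family": "Mytildae"},
    {"phylum": "Mollusca", "superfamily": "Mytiloidea", "genus": "Mytilus"},
    {"kingdom": "Animalia", "phylum": "Mollusca", "superfamily": "Mytiloidea",
     "family": "Mytildae", "genus": "Mytilus"},
]

# A missing rank is an explicit gap and the remaining values still line up.
phyla = [row.get("phylum") for row in rows]
```

With the named columns, a missing rank shows up as a gap in that column only, rather than shifting every other value in the string.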
>
> DR
>
>
>>
>> On 25/11/2010, at 11:56 AM, Richard Pyle wrote:
>>
>>> One golden rule of data management that I often
>>> tell people is that it's often better to be consistent than
>>> correct. That
>>> is, something that's consistently incorrect can be corrected easily.
>
>
>> Right -- you mean in the sense of Family, Order, Class, etc.
>> Personally, I
>> think it would be "ideal" to eliminate these individual fields and
>> just use
>> dwc:higherClassification for this purpose. People with normalised
>> data can
>> represent it properly via parentNameUsage[ID] -- with the
>> understanding that
>> all names with a rank lower than genus would include the genus name as
>> uninomial.
>