On Nov 25, 2010, at 9:11 AM, Markus Döring wrote:

The denormalised single Linnean Rank terms are very, very helpful for sharing occurrence data.
They are the primary means to distinguish between homonyms when only a canonical name is given.
And they are found in many denormalised sources like spreadsheets. No doubt these are needed!

And yes, dwc:genus and dwc:subgenus are, according to their definitions, for the *classification*, not the parsed name (even though the two are mostly the same).

As far as I can tell the dwc changes we are discussing are still the same. Either:

A) add a canonicalName term
or
B) add an atomised term for genus/uninomial + infrageneric/uninomial

I think either option would work.
A single canonical name, if given correctly, is very straightforward to parse, so personally I think this is easier than having multiple terms.
For the name part terms I think I would agree with Chuck that a single uninomial can be used for genus or infrageneric ranks.
As a canonical binomial would *not* include a subgenus or section, there is no need to have that parsed information as a term.
In case the scientificName actually IS the subgenus, the uninomial can be used.
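
To illustrate why option A is easy to consume, here is a minimal Python sketch of splitting a canonical name into its parts. It assumes the canonical name carries no authorship, no subgenus or section, and no rank marker such as "subsp."; the function name parse_canonical and the key "uninomial" are illustrative only (the latter mirrors the term proposed under option B, it is not an existing Darwin Core term).

```python
def parse_canonical(name):
    """Split a canonical name into parts.

    Assumption: the input has 1 to 3 whitespace-separated words and
    contains no authorship, no subgenus/section and no rank marker.
    """
    parts = name.split()
    if len(parts) == 1:
        # A lone word may be a genus, a subgenus or any higher taxon;
        # the hypothetical "uninomial" key covers all of these.
        return {"uninomial": parts[0]}
    if len(parts) == 2:
        return {"genus": parts[0], "specificEpithet": parts[1]}
    if len(parts) == 3:
        return {
            "genus": parts[0],
            "specificEpithet": parts[1],
            "infraspecificEpithet": parts[2],
        }
    raise ValueError(f"not a canonical name under these assumptions: {name!r}")


print(parse_canonical("Mytilus edulis"))
print(parse_canonical("Mytilus"))
```

Names that break the stated assumptions (for example a botanical trinomial with a rank marker) would need a real name parser rather than a whitespace split.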

Markus

On Nov 25, 2010, at 8:36, David Remsen (GBIF) wrote:

Rich

Your two statements below don't jibe well in this case. Putting
random concatenations of higher taxa into dwc:higherClassification
would make for a real mess. Having only the basic named Linnaean
ranks does ignore all of the intermediate ranks, but it supports
conformity at least for those in a way that higherClassification
cannot, as you lose the associated rank term. It also supports what I
think is a fairly substantial bloc of data that exists in a
denormalised form with only (or nearly only) the basic Linnaean ranks
in named rank columns. Concatenating these into
dwc:higherClassification would be lossy in this case.

My real concern, however, would be in trying to subsequently line up
multiple datasets where there were omissions in some higher ranks so
that the concatenations were abbreviated. In other words:

Bivalvia:Mytildae:Mytilus edulis
Mollusca:Mytiloidea:Mytilus: Mytilus edulis
Animalia: Mollusca:Mytiloidea:Mytildae:Mytilus: Mytilus edulis

See http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
for a real world example and, ignoring the other inherent messes,
imagine trying to deal with this with no higher rank columns for
context and all those nulls removed (no fair keeping the delimiters
for them either).
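
A rough Python sketch of the alignment problem, using the three strings above. The rank assignments in the ranked list are my own reading of those strings and are labelled as assumptions; Mytiloidea is left out there because it has no column among the basic named ranks. The helper names split_path and shared_ranks are made up for this example.

```python
paths = [
    "Bivalvia:Mytildae:Mytilus edulis",
    "Mollusca:Mytiloidea:Mytilus: Mytilus edulis",
    "Animalia: Mollusca:Mytiloidea:Mytildae:Mytilus: Mytilus edulis",
]


def split_path(path):
    """Split a concatenated classification on ':' and trim whitespace."""
    return [part.strip() for part in path.split(":")]


# Without rank labels, position is all we have, and it does not line up:
# the first element is a different rank in each string.
first_elements = [split_path(p)[0] for p in paths]
print(first_elements)  # ['Bivalvia', 'Mollusca', 'Animalia']

# The same data in named rank columns (rank assignments are assumptions).
ranked = [
    {"class": "Bivalvia", "family": "Mytildae"},
    {"phylum": "Mollusca", "genus": "Mytilus"},
    {"kingdom": "Animalia", "phylum": "Mollusca",
     "family": "Mytildae", "genus": "Mytilus"},
]


def shared_ranks(a, b):
    """Return the ranks on which two records carry the same value."""
    return {rank: a[rank] for rank in a if rank in b and a[rank] == b[rank]}


print(shared_ranks(ranked[0], ranked[2]))  # {'family': 'Mytildae'}
print(shared_ranks(ranked[1], ranked[2]))
```

With named columns a missing rank is simply an absent key, so records can still be compared rank by rank; with the concatenated strings the same omission shifts every later position.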

DR

On 25/11/2010, at 11:56 AM, Richard Pyle wrote:

One golden rule of data management that I often tell people is that
it's often better to be consistent than correct. That is, something
that's consistently incorrect can be corrected easily.

Right -- you mean in the sense of Family, Order, Class, etc.
Personally, I think it would be "ideal" to eliminate these individual
fields and just use dwc:higherClassification for this purpose. People
with normalised data can represent it properly via
parentNameUsage[ID] -- with the understanding that all names with a
rank lower than genus would include the genus name as uninomial.

_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content