Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad? [SEC=UNCLASSIFIED]

24 Nov 2010

      Rich

Your two statements below don't jibe well in this case. Putting
random concatenations of higher taxa into dwc:higherClassification
would make for a real mess. Having only the basic named Linnaean
ranks does ignore all of the intermediate ranks, but it supports
conformity at least for those ranks, in a way that higherClassification
cannot because you lose the associated rank term. It also supports what I
think is a fairly substantial bloc of data that exists in denormalised
form with only (or nearly only) the basic Linnaean ranks in named rank
columns. Concatenating these into dwc:higherClassification would be
lossy in this case.
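
As a rough illustration of the lossiness described above, here is a minimal
Python sketch. The RANKS list, the concatenate() helper and the two sample
records are hypothetical and are not part of Darwin Core; the sketch only
assumes that empty rank columns are dropped when the string is built.

```python
# Hypothetical helper: flatten named rank columns into one delimited string.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]

def concatenate(record, sep=":"):
    """Join the non-empty rank values, highest rank first.

    The rank names themselves are not written out, which is the lossy step.
    """
    return sep.join(record[rank] for rank in RANKS if record.get(rank))

# Two illustrative records with different rank columns populated.
rec_a = {"phylum": "Mollusca", "genus": "Mytilus"}
rec_b = {"class": "Bivalvia", "genus": "Mytilus"}

flat_a = concatenate(rec_a)  # "Mollusca:Mytilus"
flat_b = concatenate(rec_b)  # "Bivalvia:Mytilus"

# Both strings have two parts, but the first part is a phylum in one
# record and a class in the other; the string alone cannot say which.
first_a = flat_a.split(":")[0]
first_b = flat_b.split(":")[0]
```

With the rank columns kept, "Mollusca" and "Bivalvia" stay labelled as phylum
and class; once flattened, only position is left, and position means
different things in different records.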

My real concern, however, would be in trying to subsequently line up
multiple datasets where omissions in some higher ranks left the
concatenations abbreviated. In other words:

Bivalvia:Mytildae:Mytilus edulis
Mollusca:Mytiloidea:Mytilus: Mytilus edulis
Animalia: Mollusca:Mytiloidea:Mytildae:Mytilus: Mytilus edulis

See http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
for a real-world example and, ignoring the other inherent messes,
imagine trying to deal with this with no higher rank columns for
context and all those nulls removed (no fair keeping the delimiters
for them either).
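
To make the alignment problem concrete, here is a small Python sketch that
takes the three example strings above exactly as written and tries to line
them up by position. The split_path() helper is hypothetical; splitting on
":" and trimming whitespace is an assumption about how such strings would
be parsed.

```python
from itertools import zip_longest

# The three concatenations from the examples above, copied verbatim.
paths = [
    "Bivalvia:Mytildae:Mytilus edulis",
    "Mollusca:Mytiloidea:Mytilus: Mytilus edulis",
    "Animalia: Mollusca:Mytiloidea:Mytildae:Mytilus: Mytilus edulis",
]

def split_path(path, sep=":"):
    """Split a concatenated classification and trim stray whitespace."""
    return [part.strip() for part in path.split(sep)]

tokens = [split_path(p) for p in paths]
lengths = [len(t) for t in tokens]

# Naive positional alignment: pad the shorter paths with None on the right.
columns = list(zip_longest(*tokens, fillvalue=None))

for position, column in enumerate(columns):
    print(position, column)
```

Position 0 ends up holding Bivalvia, Mollusca and Animalia, which are three
different ranks, and nothing in the strings says so. With named rank columns
the same three records would line up without any guessing.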

DR
...
On 25/11/2010, at 11:56 AM, Richard Pyle wrote:
...
One golden rule of data management that I often tell people is that it's
often better to be consistent than correct. That is, something that's
consistently incorrect can be corrected easily.

...
Right -- you mean in the sense of Family, Order, Class, etc. Personally, I
think it would be "ideal" to eliminate these individual fields and just use
dwc:higherClassification for this purpose. People with normalised data can
represent it properly via parentNameUsage[ID] -- with the understanding that
all names with a rank lower than genus would include the genus name as
uninomial.

David Remsen (GBIF)