[tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad? [SEC=UNCLASSIFIED]

David Remsen (GBIF) dremsen at gbif.org
Thu Nov 25 09:39:13 CET 2010


dwc:genus

I think the definition "The full scientific name of the genus in which
the taxon is classified." is incomplete and only makes sense for
valid/accepted taxon names.  I think the definition should be changed so
that dwc:genus refers to the genus part of the name.  For some
synonyms, the genus part is different.  In that case, why should
dwc:genus hold the genus of the accepted/valid taxon?  The synonym is
already linked to that taxon via other methods and would inherit that
information through the link.  Repeating it is just an opportunity to
create integrity conflicts, as well as to lose some valuable
additional information.
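
To make the distinction concrete, here is a minimal sketch.  The names
"Aus bus" and "Xus bus" are hypothetical placeholders, not real taxa, and
genus_part() is an illustrative helper of my own, not anything defined
by Darwin Core.  It only assumes the name is given in canonical form.

```python
def genus_part(canonical_name: str) -> str:
    """Return the first word of a canonical name, i.e. its nominal genus."""
    parts = canonical_name.split()
    if not parts:
        raise ValueError("empty name")
    return parts[0]


# Hypothetical synonym record: "Xus bus" is treated as a synonym of "Aus bus".
synonym = {
    "scientificName": "Xus bus",
    "taxonomicStatus": "synonym",
    "acceptedNameUsage": "Aus bus",
}

# Genus part of the name itself (what I am arguing dwc:genus should hold).
nominal_genus = genus_part(synonym["scientificName"])

# Genus of the accepted taxon (what the current definition implies).
accepted_genus = genus_part(synonym["acceptedNameUsage"])

# When the two differ, storing the accepted genus on the synonym record
# both duplicates linked information and hides the nominal genus.
genera_differ = nominal_genus != accepted_genus
print(nominal_genus, accepted_genus, genera_differ)
```

The accepted genus is still recoverable through the link to the accepted
taxon, so nothing is lost by keeping the nominal genus on the synonym.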

Consider this record in the WoRMS database concerning my favorite fish:

http://www.marinespecies.org/aphia.php?p=taxdetails&id=301162

Note that the parent (genus) for this synonym is the literal, nominal
parent genus, not the genus for the valid name.  Given the degree of
homonymy among genera, this could provide useful and explicit
linking to a parent genus record, particularly if it were included,
as in this case, in the source dataset.  The value in either case
is limited in the larger aggregate world due to the recommendation
that dwc:genus, like all the named higher taxon elements, be canonical.
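
A rough sketch of why an explicit link helps with homonyms, and why a
canonical genus string alone does not.  Ficus is a genuine homonym (a
plant genus and a gastropod genus), but the taxonID values and the record
layout below are made up for illustration; only the term names
(taxonID, scientificName, kingdom, parentNameUsageID) come from Darwin
Core.

```python
from collections import defaultdict

# Two genus records sharing one canonical name (IDs are hypothetical).
genera = [
    {"taxonID": "g1", "scientificName": "Ficus", "kingdom": "Plantae"},
    {"taxonID": "g2", "scientificName": "Ficus", "kingdom": "Animalia"},
]

# Index the genus records both ways: by canonical name and by identifier.
by_name = defaultdict(list)
by_id = {}
for rec in genera:
    by_name[rec["scientificName"]].append(rec)
    by_id[rec["taxonID"]] = rec

# A child record that carries an explicit pointer to its parent genus.
child = {"scientificName": "Ficus carica", "parentNameUsageID": "g1"}

# Lookup by canonical genus string is ambiguous: both homonyms match.
candidates = by_name[child["scientificName"].split()[0]]

# Lookup by identifier resolves to exactly one parent record.
parent = by_id[child["parentNameUsageID"]]

print(len(candidates), parent["kingdom"])
```

The same applies to a synonym whose source dataset already records its
nominal parent genus: the identifier keeps that link explicit where the
canonical string cannot.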

On a related note then,  I would recommend that for synonyms,  the  
more normal and enriched dwc:parentNameUsageID should be used to  
retain this information.   In other words,  normally, a synonym is  
linked to the accepted/valid taxon via acceptedNameUsageID and  
dwc:parentNameUsageID is null.  In this case, however, it should be  
used to

DR

On Nov 25, 2010, at 9:11 AM, Markus Döring wrote:

> the denormalised single Linnean Rank terms are very, very helpful  
> for sharing occurrence data.
> They are the primary means to distinguish between homonyms when only  
> a canonical name is given.
> And they are found in many denormalised sources like spreadsheets.  
> No doubt these are needed!
>
> And yes, dwc:genus and dwc:subgenus are, according to the definition,
> for the *classification*, not the parsed name (even though this is
> mostly the same).
>
> As far as I can tell the dwc changes we are discussing are still the  
> same. Either:
>
> A) add a canonicalName term
> or
> B) add an atomised term for genus/uninomial + infrageneric/uninomial
>
> I think both options are a way to go.
> A single canonical name, if given correctly, is very straightforward
> to parse, so personally I think this is easier than having multiple
> terms.
> For the name part terms I think I would agree with Chuck that a
> single uninomial can be used for genus or infrageneric ranks.
> As a canonical binomial would *not* include a subgenus or section,
> there is no need to have that parsed information as a term.
> In case the scientificName actually IS the subgenus, the uninomial
> can be used.
>
>
> Markus
>
> On Nov 25, 2010, at 8:36, David Remsen (GBIF) wrote:
>
>> Rich
>>
>> Your two statements below don't jibe well in this case.   Putting
>> random concatenations of higher taxa into dwc:higherClassification
>> would make for a real mess.   Having only the basic named Linnaean
>> ranks does ignore all of the intermediate ranks but it supports
>> conformity at least for those in a way that higherClassification
>> cannot as you lose the associated rank term.   It also supports what I
>> think is a fairly substantial bloc of data that exists in a denormalised
>> form with only (or nearly only) the basic Linnaean ranks in named rank
>> columns.   Concatenating these into dwc:higherClassification would be
>> lossy in this case.
>>
>> My real concern, however, would be in trying to subsequently line up
>> multiple datasets where there were omissions in some higher ranks so
>> that the concatenations were abbreviated.   In other words
>>
>> Bivalvia:Mytilidae:Mytilus edulis
>> Mollusca:Mytiloidea:Mytilus: Mytilus edulis
>> Animalia: Mollusca:Mytiloidea:Mytilidae:Mytilus: Mytilus edulis
>>
>> See http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
>> for a real world example and, ignoring the other inherent messes,
>> imagine trying to deal with this with no higher rank columns for
>> context and all those nulls removed (no fair keeping the delimiters
>> for them either).
>>
>> DR
>>
>>
>>>
>>> On 25/11/2010, at 11:56 AM, Richard Pyle wrote:
>>>
>>>> One golden rule of data management that I often
>>>> tell people is that it's often better to be consistent than
>>>> correct.  That is, something that's consistently incorrect can be
>>>> corrected easily.
>>
>>
>>> Right -- you mean in the sense of Family, Order, Class, etc.
>>> Personally, I think it would be "ideal" to eliminate these individual
>>> fields and just use dwc:higherClassification for this purpose.  People
>>> with normalised data can represent it properly via
>>> parentNameUsage[ID] -- with the understanding that all names with a
>>> rank lower than genus would include the genus name as uninomial.
>>
>> _______________________________________________
>> tdwg-content mailing list
>> tdwg-content at lists.tdwg.org
>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
>
