It is my understanding that GBIF intends to split the verbatim scientific name and authorship in their processing.
So those millions of records will be split.
They will probably also provide a service that will split the names for you.
This clarification of scientificName just makes this more clear all the way down.
It also is a better match for how scientific name is used in most publications and related knowledge bases like Wikipedia, NCBI Taxonomy and eBird (probably Wikispecies too).
The parsers will need to look for well-formed strings, and in the interim they can split them if needed.
If the names are already split, it is much easier to determine which strings are correctly formed and which are not.
This reminds me of the many ways you can format a bibliographic citation. What some people seem to want is to have everyone adopt their citation format. What makes more sense is to keep the separate things separate and then combine them into whatever format is needed at the end.
It makes sense for end users to keep the authorship string as a separate field anyway.
Otherwise, how do they search for all the descriptions that might be tied to the same publication, etc.?
Respectfully,
- Pete
On Thu, Dec 9, 2010 at 11:28 AM, Chuck Miller <Chuck.Miller@mobot.org> wrote:
Perhaps we need to add a "rule" element as Bob Morris has suggested. Then with that additional fact, the usage of the other terms would be specifically declared by the provider and all this assumption/inferencing would not be needed, where the declaration of the rule was provided.
But, millions of rows of legacy data may never conform to anything done at this point. If the meaning of ScientificName is altered by a definitional change after 10 years of the DarwinCore term being used with a different definition, no doubt the end result will be even more world-wide data hegemony because there will not be a sudden switchover of all the legacy data to the new definition. That herd of elephants is not going to turn quickly, so for some long time you really won't know what you have in a given ScientificName field - the old definition or the new.
Chuck
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Richard Pyle
Sent: Thursday, December 09, 2010 10:04 AM
To: 'Gregor Hagedorn'
Cc: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] proposed term: dwc:verbatimScientificName
Rich, it is not a question of __formatting__; concatenation is just not possible, you have to parse into EVERY name, take it apart, determine whether it is an autonym, and if so __insert__ the author at the correct position for botanical names.
Yes, but that's the responsibility of the provider. Either they have the information sufficiently atomized to populate verbatimScientificName appropriately for autonyms, or they just have a pre-formatted "scientificNameWithAuthorship" (which can go in verbatimScientificName), or they do not have autonyms appropriately formatted, in which case we can't really do anything for them.
Thus, the expected content would be:
verbatimScientificName: Lobelia spicata Lam. var. spicata
scientificName: Lobelia spicata var. spicata
scientificNameAuthorship: Lam.
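To make the autonym case concrete, here is a minimal Python sketch of how a consumer might recombine the two split fields into a display string. It is only an illustration, not anything defined by Darwin Core: the function name with_authorship and the RANK_MARKERS list are hypothetical, and it assumes a simple "Genus species rank epithet" pattern with no hybrids, subgenera, or multiple infraspecific ranks.

```python
# Hypothetical helper, not part of Darwin Core or any GBIF service.
# Assumption: a short, incomplete list of botanical infraspecific rank markers.
RANK_MARKERS = {"subsp.", "var.", "subvar.", "f.", "subf."}


def with_authorship(scientific_name: str, authorship: str) -> str:
    """Combine scientificName and scientificNameAuthorship into one string.

    For a botanical autonym (infraspecific epithet repeats the species
    epithet) the author goes after the species epithet; otherwise the
    author is simply appended at the end.
    """
    authorship = authorship.strip()
    if not authorship:
        return scientific_name

    parts = scientific_name.split()
    is_autonym = (
        len(parts) == 4
        and parts[2] in RANK_MARKERS
        and parts[1] == parts[3]
    )
    if is_autonym:
        genus, species, rank, epithet = parts
        return f"{genus} {species} {authorship} {rank} {epithet}"
    return f"{scientific_name} {authorship}"


print(with_authorship("Lobelia spicata var. spicata", "Lam."))
```

For the Lobelia example this rebuilds the verbatim form with the author placed after the species epithet, while a non-autonym simply gets the author appended. The point is only that the combined string can be derived from the split fields when the provider has atomized them; real data would need a proper name parser.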
I understand this is tough on Zoologists :-), but I therefore propose
Actually, it's the botanists who are making things tough in this case... :-)
verbatimScientificName, scientificName, scientificNameAuthorship, scientificNameWithAuthorship
This covers all cases in my opinion. The comments should express that scientificNameWithAuthorship should follow all canonical name rules and recommendations of the respective Code.
I'm still not convinced we need scientificNameWithAuthorship.
Aloha, Rich
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content