What we *REALLY* need is a globally shared system for representing taxon names either as verbatim strings (we could call it something like "Grand Names Index", or "Glorious Nomenclatural Inventory", or something like that), or as highly parsed, atomized data objects corresponding to specific usages of taxon names (we could call it the "Global Nomenclatural Utility Base", or "Great Names Universal Bucket") coordinated in such a way that they interact with each other in a common architecture (maybe we could call it the "Guttenberg Names Analog").
In any case, I think the real solution is to get to the point where we need only two terms for taxonomic data:
taxonID
verbatimScientificName
As Chuck says, it's a tall order... but it sure would reduce the traffic on the tdwg-content list.
:-)
Aloha, Rich
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Chuck Miller
Sent: Thursday, December 09, 2010 5:55 AM
To: David Remsen (GBIF); Bob Morris
Cc: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] proposed term: dwc:verbatimScientificName
Bob,
Funny. It starts to sound like the "Levels" that were defined in that 94 plant names publication. They were wrangling back then with how to format the different ways the names could occur.
We have some real tension going on between having a minimal set of terms while enabling a maximal variety of use cases. I think something has to give somewhere. The use cases are not going away. So, maybe we need to accept a longer set of terms. One of them might be a "rule used" element as you suggest. But, that would mean the creation of the rules, definition of controlled vocabulary and then implementation into DwC somehow. Tall order but it may be something lacking in DwC that is causing all these late night emails to fly back and forth seemingly endlessly.
Chuck
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of David Remsen (GBIF)
Sent: Thursday, December 09, 2010 9:06 AM
To: Bob Morris
Cc: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] proposed term: dwc:verbatimScientificName
The GBIF web parser service is a good step in that direction.
http://tools.gbif.org/nameparser/
Select the test names and review the extended output.
David
On Dec 9, 2010, at 3:09 PM, Bob Morris wrote:
...
Obviously, it would be nice if algorithms did exist which could convert a text string into a scientific name, but this still lies in the future.
For those of us attempting to populate databases with information extracted from published literature, the future is now. It seems to me that normalizing the extraction to some standardized form \before/ putting it in the database is more robust than forcing the parsing to be done afterwards. So we need rules for those forms, and an unambiguous way in our metadata to cite which rules have been followed. In a previous post my p.s. also whined about a similar need for born-digital taxonomic treatments.
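To make the "normalize first, and cite the rule you followed" idea concrete, here is a rough sketch only. Everything in it is an assumption made up for illustration: the rule identifier, the `rule_used` field, the `NameRecord` class and the deliberately naive binomial pattern are not DwC terms or part of any existing parser. It keeps the verbatim string untouched, fills the atomized parts only when the pattern matches, and records which rule produced them.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical identifier for the rule set applied below; not a DwC term.
RULE_ID = "example-binomial-v0"

# Deliberately naive rule: capitalized genus, lower-case epithet,
# and anything that follows is treated as authorship.
_BINOMIAL = re.compile(r"^([A-Z][a-z]+)\s+([a-z][a-z-]+)(?:\s+(.+))?$")


@dataclass
class NameRecord:
    taxon_id: str
    verbatim_scientific_name: str  # always stored exactly as extracted
    genus: Optional[str] = None
    specific_epithet: Optional[str] = None
    authorship: Optional[str] = None
    rule_used: Optional[str] = None  # set only when a rule actually matched


def normalize(taxon_id: str, verbatim: str) -> NameRecord:
    """Parse before storing, and record which rule did the parsing."""
    record = NameRecord(taxon_id, verbatim)
    cleaned = " ".join(verbatim.split())  # collapse runs of whitespace
    match = _BINOMIAL.match(cleaned)
    if match:
        record.genus, record.specific_epithet, record.authorship = match.groups()
        record.rule_used = RULE_ID
    return record


if __name__ == "__main__":
    print(normalize("t1", "Puma  concolor (Linnaeus, 1771)"))
    print(normalize("t2", "indet. sp."))
```

A string the rule cannot handle comes back with only taxon_id and the verbatim name filled in, so nothing is lost and nothing is guessed. Real names (hybrids, infraspecific ranks, "cf." and so on) would need far more rules than this one pattern.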
Bob
--
Robert A. Morris
Emeritus Professor of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Associate, Harvard University Herbaria
email: morris.bob@gmail.com
web: http://bdei.cs.umb.edu/
web: http://etaxonomy.org/mw/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1) 857 222 7992 (mobile)
_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content