Re: Topic 3: GUIDs for Taxon Names and Taxon Concepts
There seems to be a great deal of confusion and misunderstanding in this discussion, and I apologize to whatever extent I am to blame. Several big issues are crossing paths. I don't know why the "centralized" issue cropped up, but it is, in my mind, tangential to (what should be) the main thrust of this discussion.
The idea that only the resolution system needs to be able to distinguish between specimens, taxon names, etc., seems unfortunate.
I agree.
Assigning GUIDs solely to basionyms strikes me as crazy -- I'd suggest most taxa aren't known by their basionyms. I'd advocate GUIDs for every name string (with the possible exception of orthographic variants, spelling mistakes, etc.).
By "Namestring", do you include the author, or just the taxonomic nomenclatural elements? In either case, why even bother establishing a GUID for taxonomic names at all? Why not just use the namestring itself?
Moreover, I don't think anyone has suggested that GUIDs be assigned "solely" to basionyms. The point is, there shouldn't be six different "units" to which GUIDs are assigned to represent "taxon names" [basionyms, combinations, namestrings, namestrings with authors, concept definitions, name usage instances, etc.].
Lastly, "imposing standards" is the wrong way to think about this. Standards win support if they work, and are adopted. I'd suggest this stuff will happen if people make compelling applications that others make use of, not because TDWG decides what should be done.
O.K., my choice of the word "imposing" was bad. I should have said something like "implementing". The one lesson that the web has taught us more vividly than decentralization is the power of standards for information exchange. We could have all invented our own version of DarwinCore for exposing specimen records to the internet, and allowed natural selection to take its course such that only the most useful few versions survived -- but I don't think that would have been the most effective path to where we are now (and where we could be soon, once DarwinCore V2 is ratified and mapped against ABCD elements).
So, my final question is, what is wrong with having each taxonomic database serve its own GUIDs for its own data (using an agreed format, such as RDF), and where possible GUIDs from different sources are mapped (e.g., a name string in IPNI to one in uBio)? Users employ whatever GUID they find useful -- at least we then know what digital object they are referring to.
To me, GUIDs are only valuable and serve the information exchange needs of our community if they are, to some extent at least, reusable. I do not understand the value of assigning GUIDs to the records in my taxonomic database, if I am the only one who will ever use them. I understand the practical value of establishing "local" GUIDs to my taxon name records independently from other databases, if the long-term goal is to eventually map my GUIDs to the corresponding "equivalent" GUIDs of other databases. But that brings me back to the original point I've been trying to make: that mapping can only be done effectively if we have equivalent objects to map. Right now, there are a half-dozen competing ideas about what a "name" object is (or should be). Having spent the past year beating my head against a wall trying to map my local taxonomic database records to ITIS TSNs, I can speak from first-hand experience about how utterly INefficient it can be to create such cross-walks when a "Name" record in ITIS means something different from a "Name" record in my database.
Donald started out this discussion by suggesting that "Taxon Name" objects might be thought of as fundamentally different from "Taxon Concept" objects (in an analogous way to how publication objects are fundamentally different from specimen objects). If you agree with this, then it would be nice if we could find a standard line to draw between these two objects, so that when we later try to cross-walk our respective datasets, we're mapping apples to apples, and oranges to oranges (rather than apples to bananas and then back to oranges -- which aptly represents what I'm having to resort to in mapping my taxonomic data records to ITIS records).
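To make the "apples to apples" point concrete, here is a rough Python sketch of a cross-walk that only pairs records when both databases mean the same kind of object by "name". Everything in it is an assumption for illustration: the Kind labels, the Record fields, the crosswalk function, and the GUIDs are all made up, and "Aus bus" / "Xus bus" are placeholder names. It is not how ITIS or any other real database structures its records.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    # Hypothetical labels for some of the competing "name" units.
    BASIONYM = "basionym"
    COMBINATION = "combination"
    NAME_STRING = "name string"
    CONCEPT = "concept definition"


@dataclass(frozen=True)
class Record:
    guid: str   # identifier issued by the source database (invented here)
    kind: Kind  # what that database means by a "name" record
    label: str  # human-readable name


def crosswalk(local, remote):
    """Pair records that share a label, split by whether their kinds agree.

    Returns (mapped, mismatched). Each list holds (local_guid, remote_guid)
    pairs; mismatched pairs have equal labels but different kinds.
    """
    by_label = {}
    for r in remote:
        by_label.setdefault(r.label, []).append(r)

    mapped, mismatched = [], []
    for rec in local:
        for r in by_label.get(rec.label, []):
            target = mapped if rec.kind == r.kind else mismatched
            target.append((rec.guid, r.guid))
    return mapped, mismatched


# Placeholder data: two local records and two remote records.
local = [
    Record("local:1", Kind.COMBINATION, "Aus bus"),
    Record("local:2", Kind.BASIONYM, "Xus bus"),
]
remote = [
    Record("remote:A", Kind.COMBINATION, "Aus bus"),
    Record("remote:B", Kind.CONCEPT, "Xus bus"),
]

mapped, mismatched = crosswalk(local, remote)
```

With this placeholder data, "Aus bus" maps cleanly because both sides treat it as a combination, while "Xus bus" lands in the mismatched list because one side holds a basionym and the other a concept definition. The second case is the one that costs all the time: the strings agree, but the objects do not.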
If we think of the scientific literature, many papers have at least two GUIDs (DOI and PubMed), both of which are useful, and which serve different purposes.
What is the value of expanding that to potentially hundreds, or thousands of GUIDs for each paper -- one GUID from each database that has a table of literature records? Wouldn't our information exchange be much more efficient if we all adopted (or at least mapped to) either the DOI or the PubMed GUID (or both)? If so, why doesn't this same logic apply to taxon names?
Aloha, Rich
Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
and Associate Zoologist in Ichthyology
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef@bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html