Hi Michael,
a) Growth of the specimen
b) Sampling of the specimen
c) Labeling of the specimen (locality, date, leg, etc.)
d) Conserving of the specimen
e) Determination
f) Digitisation (in house database)
g) Linking up the information to a database network
h) Analysis of the linked up specimen information
i) Revision (loop back to point f))
Just two simple questions: Where exactly should an artificial GUID be implemented?
I would suspect somewhere between steps f & g (I'd lean more towards f). But in many institutions (and in my own taxonomic work), step c is merged with step f; and may come before or after steps d & e.
What is the potential impact on the rest of the product life-cycle?
I would hope that it would result in increased efficiency & accuracy of steps g & h.
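A minimal Python sketch of that idea, assuming the GUID is minted once at digitisation (step f) and then carried unchanged through every revision (step i). The `Specimen` record, the `digitise` and `revise` helpers, and the placeholder values are all hypothetical and only illustrate the flow; they are not taken from any existing system.

```python
import uuid
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Specimen:
    guid: str           # opaque identifier, never edited after creation
    label: str          # step c: locality, date, leg, etc.
    determination: str  # step e


def digitise(label: str, determination: str) -> Specimen:
    """Step f: create the in-house record and mint its GUID exactly once."""
    return Specimen(str(uuid.uuid4()), label, determination)


def revise(specimen: Specimen, new_determination: str) -> Specimen:
    """Step i: change the determination; the GUID is carried over unchanged."""
    return replace(specimen, determination=new_determination)


# Placeholder values, purely illustrative.
original = digitise("locality X, 2004-01-01", "Genus sp.")
revised = revise(original, "Genus species")

# Step g: a network index keyed on the GUID sees one specimen, not two.
network_index = {original.guid: original}
network_index[revised.guid] = revised
```

Because the revised record overwrites the entry under the same key, anything linked in steps g and h keeps pointing at the same specimen even after the name on it changes.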
Learning a lot of Latin names with a meaning is much easier for humans than learning some obscure artificial codes. Just try to remember 10 5-digit numbers! And no, not everybody is eager to run around on a field trip equipped like a high-tech soldier.
I think this is one of the most misunderstood aspects of GUIDs. They should *NOT* be optimized for human reading. I personally don't think they should ever even be viewed by a pair of human eyes (except the occasional database manager). The scientific names (and associated information) are what the Human should read. The GUIDs are intended for computers, which are extraordinarily adept at remembering billions of 5-digit numbers, let alone 10. But I would rather see the GUIDs as 19-digit (64-bit) numbers.
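For a sense of scale, here is a small Python check, assuming the identifiers would be stored as plain 64-bit integers (that storage choice is an assumption for illustration). A signed 64-bit integer tops out at 19 decimal digits; the unsigned range needs 20.

```python
# Assumption: identifiers are stored as plain 64-bit integers.
MAX_INT64 = 2**63 - 1    # largest signed 64-bit value
MAX_UINT64 = 2**64 - 1   # largest unsigned 64-bit value


def digit_count(n: int) -> int:
    """Number of decimal digits in a non-negative integer."""
    return len(str(n))


print(MAX_INT64, digit_count(MAX_INT64))
print(MAX_UINT64, digit_count(MAX_UINT64))
```

Either way the space holds on the order of 10^19 distinct values, which is trivial for a computer to index and hopeless for a person to memorise.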
Just a simple statement: as Peter already pointed out, by using the Linnean code carefully and with high quality within our databases, we already have a working GUID in place! The thing Linné has already invented. IMHO it is only a question of database quality whether the info given is a real GUID or only a part of it.
I think it's less a question of database quality than of taxonomic quality. With a few very noteworthy exceptions, the taxonomic information we need for consistent reliance on "natural key" unique identifiers for names (Nomenclatural Code + Monomial/Binomial/Trinomial + Author(s)/Year/Page) is not organized to the point where this complex key can be used as a unique identifier by a computer with any reasonable degree of certainty and repeatability. As I said above, the GUIDs should be optimized for computer usage, not human usage.
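A minimal Python sketch of the difference, assuming two databases hold the same name but abbreviate the authorship differently. The `NameRecord` class and its fields are hypothetical, chosen only to mirror the "natural key" components listed above; the shared GUID is assumed to have been issued once by some common registry.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    guid: uuid.UUID   # opaque identifier, assumed issued once and shared
    code: str         # nomenclatural code
    name: str         # monomial/binomial/trinomial
    authorship: str   # author(s)/year as typed by each database

    def natural_key(self) -> tuple:
        """The composite 'natural key' built from the human-readable parts."""
        return (self.code, self.name, self.authorship)


shared = uuid.uuid4()
a = NameRecord(shared, "ICZN", "Homo sapiens", "Linnaeus, 1758")
b = NameRecord(shared, "ICZN", "Homo sapiens", "L. 1758")

# Same name, but the natural keys differ on a purely typographic detail.
natural_match = a.natural_key() == b.natural_key()
# The opaque identifier is unaffected by how the authorship was typed.
guid_match = a.guid == b.guid

print(natural_match)  # False
print(guid_match)     # True
```

A human sees at once that both records mean the same name; an exact-match comparison on the composite key does not, and fuzzy matching only trades that problem for false positives.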
Therefore I have the following suggestions for the 1.5 Million $ project:
- Build a schema that enforces better overall quality, and not only
in the taxon information.
[...]
- Help and convince database holders to raise the quality of their
contributions
- Enhance the database wrappers to speed up the network (I had to look into this for a GBIF Austria portal and found out that the software can be improved in terms of speed just by rewriting some parts of it in C).
All excellent suggestions, and all of which would, I firmly believe, be enhanced through the establishment of universal GUIDs for biologically-relevant data objects.
Aloha, Rich
Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef@bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html