Rod Page wrote:
The issue of centralisation and "taxonomic freedom" is a red herring (due to Rich and I talking past each other -- my fault).
I share the blame.
My concern (and I know I'm doing a very poor job of explaining this) is that administrative layers and attempting to harmonise things centrally may ultimately prove to get in the way of getting things done.
I agree that it might, but I don't think it necessarily has to. Again, I point to IP addresses and DNS. They are obviously harmonized, but aren't locked up in administrative layers (or are they???)
Arthur Chapman wrote:
To some extent I think Sally is right; what she suggests wouldn't, however, improve things with respect to duplicate specimens: specimens of the same collection sent to more than one (and sometimes many) different institutions.
And Sally Hinchcliffe responded:
regarding duplicates, to my mind, there's a very strong connection between the duplicate of 'Chapman 541' held at Kew and the one held at Canberra (assuming there is such a thing). But they aren't the same thing (and shouldn't have the same id) - a photograph of Kew's duplicate would look nothing like a photograph of Canberra's duplicate for example. I'm not sure whether GUIDs are going to help us solve the problem of duplicates although I agree that it's desirable. What _could_ have the same GUID is the collecting event (Chapman taking a specimen) that gave rise to the two specimens ... but who or what would issue or control _those_ GUIDs?
It depends on what object the GUID is attached to: the preserved specimen in a museum, or the individual organism from which the specimen was extracted. If the former, they need separate GUIDs. If the latter, they may still need separate "Museum specimen" GUIDs (for tracking museum-specimen-related stuff), but may also need to both point to a common "individual organism" GUID and/or a common "collecting event" GUID.
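To make that distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the class and field names are hypothetical, and random UUIDs merely stand in for whatever GUID scheme is eventually chosen. It shows two duplicates carrying separate specimen GUIDs while pointing to one shared collecting-event GUID.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


def new_guid() -> str:
    # Assumption: a random UUID stands in for any real GUID scheme.
    return str(uuid.uuid4())


@dataclass(frozen=True)
class CollectingEvent:
    guid: str        # one GUID for the act of collecting
    collector: str
    number: str


@dataclass(frozen=True)
class MuseumSpecimen:
    guid: str                            # unique to this physical specimen
    institution: str
    event_guid: str                      # shared by all duplicates of one gathering
    organism_guid: Optional[str] = None  # shared only if taken from one individual


def are_duplicates(a: MuseumSpecimen, b: MuseumSpecimen) -> bool:
    # Two distinct specimens are duplicates if they stem from the same event.
    return a.guid != b.guid and a.event_guid == b.event_guid


# Hypothetical records for the 'Chapman 541' example.
event = CollectingEvent(new_guid(), "Chapman", "541")
kew = MuseumSpecimen(new_guid(), "Kew", event.guid)
canberra = MuseumSpecimen(new_guid(), "Canberra", event.guid)
```

With this layout, a photograph would be attached to kew.guid or canberra.guid, while anything about the gathering itself would be attached to event.guid. Who issues the event GUID is left open here, just as it is in the discussion.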
Chuck Miller wrote:
An issue that needs to be decided by the workshop is how much "abstraction" of a GUID is absolutely necessary if it must also be locatable through the Internet? That is, is a compromise needed to allow embedding of domain names in order to enable use of the DNS to locate GUIDs? Surely there is insufficient time, funds, and staff to embark upon creation of a master switchboard (database) where the locations of millions of GUIDs are recorded and updated in perpetuity.
I think we can have the best of both worlds, and I don't think we need a master switchboard keeping track of the locations of millions of GUIDs. If GBIF or TDWG or some other similar organization would simply provide a service where anyone can self-register and reserve a block of 64-bit integers, then those of us who wish to do so may replace our own internal ID numbers with corresponding values from the reserved block. That way, we can present them as part of an LSID, or a Handle System identifier, or the terminal bit of whatever GUID system is used, knowing with confidence that nobody else out there will use the same integer for any ID value within the TDWG/GBIF/bioinformatics universe. The organization in charge of keeping track of which clients reserved which blocks of numbers would *not* need to track what data objects those individual numbers are assigned to. In fact, I suspect that the majority of those numbers would never get assigned to anything. But those that do get assigned would be guaranteed to be unique (by themselves) within the TDWG/GBIF/bioinformatics scope. With that in place, an LSID or DOI-like Handle GUID would serve as a contextual "wrapper" around a core unique ID number (a number that could transcend any particular GUID resolution system).
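A rough sketch of what such a block-reservation service might look like, in Python. This is only an illustration under stated assumptions: the BlockRegistry class, its method names, the sequential allocation policy, and the example authority and namespace strings are all hypothetical, not an existing GBIF or TDWG service. Note that the registry records only who holds which range, never what the individual numbers are assigned to.

```python
from dataclasses import dataclass, field

MAX_ID = 2**64 - 1  # largest unsigned 64-bit integer


@dataclass
class BlockRegistry:
    """Hands out non-overlapping blocks of 64-bit integers (hypothetical)."""
    next_free: int = 1
    blocks: list = field(default_factory=list)  # (owner, start, end) tuples

    def reserve(self, owner: str, size: int) -> range:
        # Assumption: blocks are allocated sequentially from the next free value.
        if size <= 0:
            raise ValueError("block size must be positive")
        start = self.next_free
        end = start + size - 1
        if end > MAX_ID:
            raise OverflowError("64-bit ID space exhausted")
        self.blocks.append((owner, start, end))
        self.next_free = end + 1
        return range(start, end + 1)

    def owner_of(self, value: int):
        # The registry knows the block holder, not the data object.
        for owner, start, end in self.blocks:
            if start <= value <= end:
                return owner
        return None


def to_global_id(block: range, local_id: int) -> int:
    """Map a provider's internal ID (used as a 0-based offset) into its block."""
    global_id = block.start + local_id
    if global_id not in block:
        raise ValueError("local ID does not fit in the reserved block")
    return global_id


def wrap_as_lsid(authority: str, namespace: str, global_id: int) -> str:
    # The same integer could equally be wrapped in a Handle or another scheme.
    return f"urn:lsid:{authority}:{namespace}:{global_id}"
```

A provider would reserve a block once, for example block = registry.reserve("kew", 1000), then map its internal IDs with to_global_id(block, 541) and wrap the result in whichever GUID syntax it prefers. The integer stays unique within the registry's scope regardless of the wrapper.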
Donald Hobern wrote:
...a very helpful and thoughtfully-framed post, which I will think about during my plane ride tonight.
See you all in a couple days....
Aloha, Rich