Combined response

Mon Jan 30 09:56:51 CET 2006

Rod Page wrote:

> The issue of centralisation and "taxonomic freedom" is a red herring
> (due to Rich and I talking past each other -- my fault).

I share the blame.

> My concern
> (and I know I'm doing a very poor job of explaining this) is that
> administrative layers and attempting to harmonise things centrally may
> ultimately prove to get in the way of getting things done.

I agree that it might, but I don't think it necessarily has to.  Again, I
point to IP addresses and DNS.  They are obviously harmonized, but aren't
locked up in administrative layers (or are they???)

Arthur Chapman wrote:

> To some extent I think Sally is right, what she suggests
> wouldn't improve things with respect to duplicate specimens,
> however -- specimens of the same collection sent to more than one
> (and sometimes many) different institutions.

And Sally Hinchcliffe responded:

> > regarding duplicates, to my mind, there's a very strong connection
> > between the duplicate of 'Chapman 541' held at Kew and the one held
> > at Canberra (assuming there is such a thing). But they aren't the
> > same thing (and shouldn't have the same id) - a photograph of Kew's
> > duplicate would look nothing like a photograph of Canberra's
> > duplicate for example. I'm not sure whether GUIDs are going to help
> > us solve the problem of duplicates although I agree that it's
> > desirable. What _could_ have the same GUID is the collecting event
> > (Chapman taking a specimen) that gave rise to the two specimens ...
> > but who or what would issue or control _those_ GUIDs?

It depends on what object the GUID is attached to:  the preserved specimen
in a Museum; or the individual organism from which the specimen was
extracted.  If the former, they need separate GUIDs.  If the latter, they
may still need separate "Museum specimen" GUIDs (for tracking
museum-specimen-related stuff), but may also need to both point to a common
"individual organism" GUID and/or a common "collecting event" GUID.

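To make that concrete, here is a rough sketch in Python of what I have in
mind.  It is only an illustration: the class names, field names and GUID
strings are all invented for the example, and nothing here reflects an
agreed TDWG or GBIF schema.  Each museum specimen carries its own GUID, and
duplicates are recognised because they point to the same "collecting event"
GUID (and, optionally, the same "individual organism" GUID).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Specimen:
    """A preserved specimen held by one institution (hypothetical model)."""
    guid: str                             # unique to this physical specimen
    institution: str
    collecting_event_guid: str            # shared by all duplicates
    organism_guid: Optional[str] = None   # shared if cut from one organism


def are_duplicates(a: Specimen, b: Specimen) -> bool:
    """Two distinct specimens are duplicates if they share a collecting event."""
    return a.guid != b.guid and a.collecting_event_guid == b.collecting_event_guid


# Made-up identifiers for the 'Chapman 541' example discussed above.
kew = Specimen("spec-0001", "Kew", "event-chapman-541", "organism-0001")
canberra = Specimen("spec-0002", "Canberra", "event-chapman-541", "organism-0001")
```

With those two records, are_duplicates(kew, canberra) is true even though
the two specimens never share a GUID of their own.
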
Chuck Miller wrote:

> An issue that needs to be decided by the workshop is how much
> "abstraction" of a GUID is absolutely necessary if it must also be
> locatable through the Internet?  That is, is a compromise needed
> to allow embedding of domain names in order to enable use of the
> DNS to locate GUIDs.  Surely there is insufficient time, funds,
> and staff to embark upon creation of a master switchboard (database)
> where the locations of millions of GUIDs are recorded and updated
> in perpetuity.

I think we can have the best of both worlds, and I don't think we need a
master switchboard keeping track of the locations of millions of GUIDs.  If
GBIF or TDWG or some other similar organisation would simply provide a
service where anyone can self-register and reserve a block of 64-bit
integers, then those of us who wish to do so may substitute our own internal
ID numbers with corresponding values from the reserve block.  That way, we
can present them as part of an LSID, or handle system, or the terminal bit
of whatever GUID system is used, knowing with confidence that nobody else
out there will use the same integer for any ID value within the
TDWG/GBIF/bioinformatics universe.  The organization in charge of keeping
track of which clients reserved which blocks of numbers would *not* need to
track what data objects those individual numbers are assigned to.  In fact,
I suspect that the majority of those numbers would never get assigned to
anything.  But those that do get assigned, would be guaranteed to be unique
(by themselves) within the TDWG/GBIF/bioinformatics scope.  With that in
place, an LSID or DOI-like Handle GUID would serve as a contextual "wrapper"
around a core unique ID number (a number that could transcend any particular
GUID resolution system).
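
As a minimal sketch of the idea (Python, purely illustrative), the
registry below hands out consecutive, non-overlapping blocks of 64-bit
integers and remembers only which client holds which block.  The names
BlockRegistry, to_global_id and wrap_as_lsid are my own inventions for
this example, as are the "example.org" authority and the "specimen"
namespace; a real service would also need persistence, authentication and
so on.

```python
MAX_ID = 2**64 - 1  # largest unsigned 64-bit integer


class BlockRegistry:
    """Hypothetical central service: tracks block ownership, nothing else."""

    def __init__(self):
        self.next_free = 1
        self.blocks = {}  # client name -> list of reserved ranges

    def reserve(self, client, size):
        """Reserve the next `size` unused integers for `client`."""
        if size <= 0:
            raise ValueError("block size must be positive")
        last = self.next_free + size - 1
        if last > MAX_ID:
            raise OverflowError("64-bit ID space exhausted")
        block = range(self.next_free, last + 1)
        self.blocks.setdefault(client, []).append(block)
        self.next_free = last + 1
        return block

    def owner_of(self, number):
        """Return the client whose block contains `number`, or None."""
        for client, blocks in self.blocks.items():
            if any(number in block for block in blocks):
                return client
        return None


def to_global_id(block, local_id):
    """Map a client's 0-based internal ID onto its reserved block."""
    candidate = block.start + local_id
    if candidate not in block:
        raise ValueError("internal ID falls outside the reserved block")
    return candidate


def wrap_as_lsid(authority, namespace, number):
    """Wrap the core number in an LSID-style URN (one possible wrapper)."""
    return f"urn:lsid:{authority}:{namespace}:{number}"
```

For example, if "kew" reserves 1000 numbers and "canberra" then reserves
500, Kew holds 1 through 1000 and Canberra's block starts at 1001, so
wrap_as_lsid("example.org", "specimen", 1001) gives
"urn:lsid:example.org:specimen:1001".  Note that the registry never learns
what object, if any, number 1001 was assigned to.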

Donald Hobern wrote:

...a very helpful and thoughtfully-framed post, which I will think about
during my plane ride tonight.

See you all in a couple days....

Aloha,
Rich