Re: Different reasons for different GUIDs (Was: GUIDs, LSIDs, and metadata)

13 Sep 2005

Thanks, Donald.
...
There are different situations (use cases) in which we may require GUIDs.
The service characteristics of GUIDs may not always be the same.
I agree, and this is true at two levels.  At one level, the GUID needs may
be different for different data domains (i.e., top-level object categories:
Taxon Names, References, Specimens, Agents, etc.).  At another level, the GUID
needs for different use cases *within* a data domain may differ.
...
3. Users can assume that data records relate to the same real-world object
if and only if they share the same GUID.
I think that we have generally assumed that all our GUID uses will require
this additional function, but this is the part that will get very
expensive to implement.
Why would it be so much more expensive to implement?
...
I believe that we should take a case-by-case decision on
whether the additional infrastructure is really required to support this
level of integration.
I'm not sure I follow.  Are you saying that it requires additional
infrastructure to support the notion that "Users can assume that data
records relate to the same real-world object if and only if they share the
same GUID"?  But under what circumstances could users ever assume that data
records relate to the same real-world object when they do *not* share the
same GUID?  It would seem to me that it would require a human brain to
reliably ascertain that GUID A refers to the same real-world object as GUID
B.  And once that determination is made, the "expensive" part is already
done. At that point, the GUIDs should be (objectively) "synonymized", so
that future users who reference either GUID do not have to repeat the same
expensive decision-making process involved in ascertaining their
equivalence.
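
To make the "synonymize once, reuse afterwards" idea concrete, here is a minimal sketch, not a proposal for any particular resolver. It assumes a simple in-memory registry; the class name GuidRegistry, its methods, and the "guid:A"-style identifiers are all hypothetical. The human decision is recorded once, and later lookups on either GUID get the answer without repeating it.

```python
class GuidRegistry:
    """Hypothetical registry of human-made 'same real-world object' assertions."""

    def __init__(self):
        # Maps each known GUID to a parent GUID; a root maps to itself.
        self._parent = {}

    def _find(self, guid):
        # A GUID never seen before is its own representative.
        self._parent.setdefault(guid, guid)
        root = guid
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every GUID on the walk directly at the root.
        while self._parent[guid] != root:
            self._parent[guid], guid = root, self._parent[guid]
        return root

    def synonymize(self, a, b):
        """Record the (expensive, human) decision that a and b are the same object."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_object(self, a, b):
        """True only if some chain of recorded assertions links a and b."""
        return self._find(a) == self._find(b)


registry = GuidRegistry()
registry.synonymize("guid:A", "guid:B")  # decided once by a person
registry.synonymize("guid:B", "guid:C")  # a later, separate decision
```

In this sketch two GUIDs with no recorded link are treated as different objects, which is the default assumption argued for above.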

Maybe I've completely misunderstood your point, but I guess what I'm
struggling to understand is, when presented with two data objects with
different GUIDs, under what circumstances could a user "assume" anything
other than that they are not the same real-world object?
...
As an example, I can see excellent reasons for us at least to try for
benefit 3 for specimen records, and also probably in some way for taxon
names and publications (whatever actual form the identifiers for these may
take).  However (I think, like Rod) I am not totally convinced that we will
need to manage GUIDs offering benefit 3 for taxon concepts.
O.K., now I think I understand where you are coming from (i.e., the "fuzzy"
nature of taxonomic concept GUIDs, and what they really represent).  As I
have argued during the TCS/LC discussions, my feeling is that a Taxonomic
Concept object should not receive a GUID that is disconnected from the
[TaxonName]+[Publication] pair of GUIDs. In other words, I think that a
Concept should be uniquely identified via a name-usage instance
(specifically, the name-usage instance that defined the concept). But I may
not be in the majority on this.
...
We should use
GUIDs (with benefits 1 and 2) for taxon concepts, and our taxon concepts
should reference names and publications which probably have the stronger
kind of GUIDs (with benefits 1-3).  The coupling of a name GUID and a
publication GUID would then be a unique combination that would still allow
concepts to be compared.
Exactly... so why assign a different GUID to the Concept?  Or, perhaps put a
different way, why assign a different GUID to the *name* (in the Taxonomer
paradigm that names do not exist as stand-alone objects independent of usage
instances)?
...
This is just an example, but indicates that we need to think hard about the
use cases for each data object class and its associated GUIDs.
Agreed!

Aloha,
Rich

Richard L. Pyle, PhD
Ichthyology, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef@bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html
