What do we mean by GUID?

Nozomi Ytow nozomi at BIOL.TSUKUBA.AC.JP
Thu Oct 13 03:00:53 CEST 2005

Matt Jones wrote:

> I think you may be taking a more literal definition of the word
> 'identifier' than is often used.

I don't think so.  Our difference comes from the scope of 'identity'.
I prefer to use equivalence of data rather than identity of objects
when we are talking about a proxy of a data object.  This distinction
between equivalence and identity is essential to manage versioning
(in philosophical terms, time).

> That they call the surrogate an 'identifier' does not imply that it is
> equivalent to the whole tuple, only that it is a proxy or surrogate for
> the tuple.  I believe this is the sense in which we have been using the
> term 'identifier' in these GUID discussions.

Does that mean key-by-key rather than bit-for-bit equivalence?
That would be what is called indiscernibility in rough set theory.
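
To make the question concrete, here is a minimal sketch of the two
notions, assuming records are simple attribute maps.  The records, the
attribute names and the choice of key attributes are hypothetical and
only serve as an illustration; they are not taken from any TDWG schema.

```python
def indiscernible(a: dict, b: dict, keys: tuple) -> bool:
    """True if records a and b agree on every attribute listed in keys."""
    return all(a[k] == b[k] for k in keys)


# Hypothetical records: same key attributes, different remarks.
rec1 = {"institution": "TSUKUBA", "catalog": "A-123", "remarks": "checked 2004"}
rec2 = {"institution": "TSUKUBA", "catalog": "A-123", "remarks": "checked 2005"}

# Hypothetical choice of which attributes count as "key".
KEYS = ("institution", "catalog")

key_equal = indiscernible(rec1, rec2, KEYS)  # key-by-key comparison
full_equal = rec1 == rec2                    # stand-in for bit-for-bit comparison
```

Here rec1 and rec2 are indiscernible with respect to KEYS although
they are not equal as whole records.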

> By definition, the provider should have no capability to change the bits
> behind an identifier -- this eliminates the persistence and replicability
> properties.  Any changes or corrections to the bits must be at least
> accompanied by a version increment in the ID or we lose all of the
> advantages of having the ID.

The GUID (including version) need not be issued until someone
outside the provider accesses the data.  However, that clarifies
what you meant by valid.  I would say 'unique representation' instead
of 'valid interpretation'.

> I need the GUID because I should not have to download a 1GB
> hyperspectral image to compare it against my own collection to see if I
> already have it.

If size is an issue, a (good) hash code would work as a contents proxy.
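
As a rough sketch of what I mean, assuming SHA-256 counts as a 'good'
hash and that both sides can read their own copy of the file, each
party computes a digest locally and only the digests are compared.
The function name and the chunk size are my own choices for
illustration.

```python
import hashlib


def content_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a 1GB image never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Two files with the same digest can be treated as having the same bits
for practical purposes.  Note that this is a proxy for bit-for-bit
equivalence only; it says nothing about key-by-key equivalence.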

> Yes, it is assigned to a data object.  However, different applications
> that I can conceive would want to use differently grained objects.

Ah!  That is why you said 'interpretation'.  I don't think a GUID is a
practical idea if we are required to assign one to different
representations derived from the same set of data; here 'set' implies
that the data object transferred to a portal may be generated from
multiple data objects at the provider.  It may be feasible if we restrict
ourselves to a limited number of representations, i.e. ABCDC, TCS.

> > It may be better to use other words such as global disambiguator
> > or distinguisher, because we do not mean identity by identifier.
> I think we do mean identity.  The identifier is a label that can be used
> as a handle to reference the object.

Suppose you have two data objects having the same GUID.  Their (key)
contents may be identical (I'd like to say equivalent to avoid
confusion), but they don't have the same identity.  Talking about the
identity of two or more objects is illogical.  If there are two or more,
then they have different identities.  If it is identity, then there is
only one thing.  I would say, if two data objects have the same GUID,
they should be equivalent (or, key contents should be identical).
'Equivalent' may imply 'interpretation' also.
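
Programming languages draw the same line.  As a small illustration in
Python, assuming a made-up DataObject class and a made-up GUID string
(neither comes from any real system), two copies with the same GUID
and the same contents compare as equal but are still two objects:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    guid: str       # hypothetical GUID string
    content: bytes  # the (key) contents


a = DataObject("urn:example:1234", b"specimen record")
b = DataObject("urn:example:1234", b"specimen record")

equivalent = (a == b)  # the generated __eq__ compares field values
identical = (a is b)   # true only if both names refer to one object
```

Here equivalent is true and identical is false, which is the sense in
which I say the GUID gives equivalence rather than identity.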

Dr. Nozomi "James" Ytow
Institute of Biological Sciences / Gene research center
University of Tsukuba
Tsukuba, Ibaraki 305-8572
