Globally Unique Identifier - Part III

Richard Pyle deepreef at BISHOPMUSEUM.ORG
Fri Sep 24 09:30:56 CEST 2004


> > Agreed!  And that might be the better approach (for a lot of reasons).  My
> > only concern would be to what extent desktop database applications can use
> > MAC values as primary keys (compared to using something like long integers)
> > efficiently and effectively, when manipulating large datasets in real time.
> > I guess this may be a trivial point in the grand scheme of things -- but
> > ultimately taxonomists will want to be able to work with large datasets on
> > their personal computers.
> >
>
> eh??? MAC addresses /are/ just long integers.

Sorry....I meant "Long" integer in the sense of DB applications, where a Long
is a 32-bit (4-byte) integer.  I had read once that MAC-style GUIDs are not
optimized on desktop DB applications for use as primary keys.  However,
having just done some Google-snooping, I can't confirm that (in fact, if
anything, I've found the opposite).  So I'm starting to think that my
concerns about using MAC-style IDs as primary keys are probably a red
herring.
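
To make the size difference concrete, here is a rough sketch in Python (my choice of language; nothing in this thread prescribes it). It assumes that "MAC-style GUID" means a version 1 UUID as produced by the standard `uuid` module, and it compares that with a signed 32-bit "Long" key. It shows only storage size, not how any particular desktop DB indexes either type.

```python
import struct
import uuid

# A "Long" in the desktop-DB sense: a signed 32-bit integer (4 bytes).
LONG_MAX = 2**31 - 1
long_key = struct.pack("<i", LONG_MAX)

# A MAC-style (version 1) GUID: timestamp + clock sequence + 48-bit node.
# Note: if Python cannot read a hardware address, the node field is a
# random 48-bit number instead of a real MAC.
guid = uuid.uuid1()
guid_key = guid.bytes

print(len(long_key), "bytes for a 32-bit Long key")
print(len(guid_key), "bytes for a GUID key")
print("GUID viewed as one integer:", guid.int)
print("48-bit node (MAC) field: %012x" % guid.node)
```

So the GUID really is "just a long integer", only four times wider than a DB Long; whether that width matters in practice depends on the DB engine.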

> Once reduced to a digital representation, /everything/ is just a long
> integer. Hence in the end, for machine use, the differences between
> different ID schemes come down to a very few criteria, all having to do
> with programming ease and the effective computability of each of the
> assertions about the acquisition of IDs and manipulations of them that
> are required. No two schemes are distinguishable solely on the basis
> that one of them appears to have a representation as integers and one of
> them appears not to.

Agreed -- but aren't DB applications optimized to utilize certain kinds of
ID schemes more effectively/efficiently than other kinds of ID schemes?

Many thanks for the enlightening (and interesting) insights on MAC IDs.  It
always amazes me that, though I may understand computers better than 95% of
people on Earth, the gap between me and the top 1% is much larger than the
gap between me and, say, the fish in my aquarium.

I think the important parts of this discussion surround the functional
parameters of the GUIDs for biological objects:

1) Should issuance of IDs be controlled from a single source; or freely
created by anyone with a computer, anywhere, anytime; or something
in-between?

2) Is it important that all biological objects use the same scheme for ID
sourcing, or is it advantageous to choose a scheme optimal for each class of
object (e.g., privately owned and managed specimen data, vs. publicly owned
and managed taxonomic nomenclature data)?

3) Should contextual information about how to resolve the ID be embedded
within the ID itself, or should the responsibility of context be relegated
to presentation protocols?

4) What is the greater risk to information flow in our context: reliance on
a single-point source for all the data resolution, or reliance on many
sources functioning simultaneously in order to resolve all of the data? What
options for mitigating impediments to data resolution are available to each
of these approaches?
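
As a rough illustration of the two extremes in question 1, here is a minimal Python sketch. `CentralRegistry` and `issue_locally` are hypothetical names invented for this example, and the version 1 UUID stands in for a "MAC-style" ID; it is not a proposal for how issuance should actually work.

```python
import itertools
import uuid


class CentralRegistry:
    """Hypothetical single issuing authority: hands out sequential integers."""

    def __init__(self):
        self._counter = itertools.count(1)

    def issue(self):
        # Every requester must be able to reach this one registry.
        return next(self._counter)


def issue_locally():
    """Decentralised issuance: any machine mints its own MAC-style ID."""
    return uuid.uuid1()


registry = CentralRegistry()
central_ids = [registry.issue() for _ in range(3)]
local_ids = [issue_locally() for _ in range(3)]

print("centrally issued:", central_ids)
print("locally issued:  ", [str(u) for u in local_ids])
```

The central scheme gives short, ordered IDs but depends on the registry being reachable; the local scheme needs no coordination at the cost of longer, opaque IDs.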

Aloha,
Rich



