How GUIDs will be used

Richard Pyle deepreef at BISHOPMUSEUM.ORG
Mon Jan 30 01:59:10 CET 2006


Hi Rod,

> To me centralisation is red rag to a bull, especially as the objects of
> interest (names and concepts) are things we might reasonably disagree
> over.

Please don't misunderstand what I'm talking about here. Of course we might
reasonably disagree over which names to regard as valid and which to regard
as synonyms.  We will also disagree about the scope of organisms to include
within the circumscription of a taxon concept.  However, in most cases, we
will not disagree that Smith (1955) described the species "bus", and placed
it in the genus "Aus" (i.e., the taxon name object "Aus bus Smith 1955"); or
that Jones (1975) regarded Smith's "Aus bus" as a junior synonym of Brown's
(1935) "Aus xus" (i.e., the taxon concept object "Aus xus Brown 1935 SEC
Jones 1975"; the circumscription of which includes the taxon concept object
"Aus bus Smith 1955 SEC Smith 1955").

Centralizing the issuance of GUIDs for things like taxon name objects and
concepts/usage instances does NOT, in any way, centralize "taxonomy".  It
simply serves to avoid issuing 150 different GUIDs for the taxon name object
"Aus bus Smith 1955" -- one GUID from each of 150 different data providers
that happen to list that name in their taxonomic authority table.
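
To make the distinction concrete, here is a minimal sketch in Python of what
coordinated issuance could look like. Everything in it is an assumption made
for illustration: the NameRegistry class, its guid_for method, and the choice
of genus/epithet/author/year as the identifying key are hypothetical, not an
existing service or an agreed standard. The registry records only that a name
object exists; it says nothing about whether the name is valid or a synonym.

```python
import uuid


class NameRegistry:
    """Hypothetical shared registry: one GUID per taxon name object."""

    def __init__(self):
        self._guids = {}

    @staticmethod
    def _key(genus, epithet, author, year):
        # Normalize case and whitespace so trivial formatting differences
        # between providers do not produce separate entries.
        return (genus.strip().lower(), epithet.strip().lower(),
                author.strip().lower(), int(year))

    def guid_for(self, genus, epithet, author, year):
        # Issue a new GUID only the first time a name object is seen;
        # later requests for the same name get the existing GUID back.
        key = self._key(genus, epithet, author, year)
        if key not in self._guids:
            self._guids[key] = str(uuid.uuid4())
        return self._guids[key]


registry = NameRegistry()

# Two providers list the same name object, with slightly different formatting.
guid_a = registry.guid_for("Aus", "bus", "Smith", 1955)
guid_b = registry.guid_for("aus ", "bus", "Smith", "1955")

# A different name object gets its own GUID.
guid_c = registry.guid_for("Aus", "xus", "Brown", 1935)
```

Here guid_a and guid_b are the same identifier, so no cross-mapping is needed
later, while guid_c differs. Real name matching would need far more than this
simple normalization; the sketch only shows where the duplication is avoided.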

> Why not let users decide this, by which I mean, if a provider
> comes up with a comprehensive list of names with good supporting
> metadata, users will gravitate towards using them. There will also be a
> "market" for people building services that map between GUIDs (I'm
> thinking of making one for TreeBASE, for example). Why centralise this
> activity?

So that we don't need a "market" for services to cross-map duplicate GUIDs
that never needed to be created in the first place.  Instead, we should
"market" services that utilize a common/shared set of GUIDs for objective
name objects (and concept/usage objects) to assist *taxonomy*. (And, in the
shorter term, market tools that allow data providers to cross-map their
internal taxonomic authorities to shared GUIDs.)

We certainly can't eliminate duplicates, but at least we can try to minimize
the unnecessary duplicates. I spend an inordinate chunk of my time doing two
things that I should not have to do: 1) cross-mapping large datasets to a
common shared authority (like taxon names); and 2) cleaning up the database
messes created by earlier workers who were pressed for time, and opted for
the quick & dirty solution.

Frankly, I'm not sure why we even need GUIDs for things like Taxon Names,
other than to mitigate these two kinds of problems. I thought the point was
to facilitate electronic information flow.  How have we facilitated
electronic information flow if you assign one GUID for "Aus bus Smith 1955",
and I assign another GUID to the same taxon name, and a pair of human eyes
is required to ascertain that they are, indeed, two pointers to the same
abstract data object?

> I see the point that multiple GUIDs for the same thing can be
> a pain (for papers we have DOIs, PubMed ids, Google Scholar ids, DSpace
> handles, etc.), but in the end centralised GUID assignment reeks of
> committees, etc., in other words, impediments to actually getting
> things done.

Again, please do not confuse the idea of centralized (or at least
coordinated) issuance of GUIDs for unambiguously shared data objects (like
taxon name objects), with some sort of ill-advised centralized effort for a
"shared taxonomy".  I have not seen anybody in recent years even suggest the
possibility of the latter.

> I agree that software tools to "cross-walk multiple
> independent datasets with broadly overlapping data objects" would be
> very nice, but let's separate this from centralising GUID assignment.
> One of the lessons of the web, IMHO, is that centralisation doesn't
> scale.

You can't scale much bigger than the global pool of IP addresses, which are
ultimately issued in blocks in a coordinated, semi-centralized way (not
altogether unlike a model of GUID issuance that I have previously
suggested).
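
As a rough illustration of block-based issuance, the sketch below has a
coordinator that does nothing except hand out non-overlapping blocks of
identifiers, while each issuer assigns identifiers locally from its own
blocks. The names BlockAuthority and Issuer, the integer identifiers, and the
block size are all assumptions chosen for the example; this is not the
previously suggested model itself, only the general pattern the IP address
analogy points to.

```python
class BlockAuthority:
    """Hypothetical coordinator that only allocates non-overlapping ID blocks."""

    def __init__(self, block_size=1000):
        self.block_size = block_size
        self._next_start = 0

    def allocate_block(self):
        # Hand out the next unused range and advance the high-water mark.
        start = self._next_start
        self._next_start += self.block_size
        return range(start, start + self.block_size)


class Issuer:
    """Hypothetical registrar that issues IDs locally from its own blocks."""

    def __init__(self, authority):
        self._authority = authority
        self._ids = iter(())  # empty until the first block is requested

    def issue(self):
        try:
            return next(self._ids)
        except StopIteration:
            # Current block is exhausted (or none yet): request a new one.
            self._ids = iter(self._authority.allocate_block())
            return next(self._ids)


authority = BlockAuthority(block_size=3)
issuer_a = Issuer(authority)
issuer_b = Issuer(authority)

ids_a = [issuer_a.issue() for _ in range(4)]  # spills into a second block
ids_b = [issuer_b.issue() for _ in range(2)]
```

The coordinator is contacted only when a block runs out, so day-to-day
issuance stays local and the two issuers can never collide. Note that this
guarantees unique identifiers only; avoiding two identifiers for the same
name object still requires a shared lookup of what has already been
registered.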

Aloha,
Rich

Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html



