[tdwg-guid] First step in implementing LSIDs
Richard Pyle
deepreef at bishopmuseum.org
Wed Jun 13 01:28:41 CEST 2007
Thanks, Jerry -- that's pretty much what I figured, and seems to make
perfect sense to me. In my own case, as I am now in the process of
completely overhauling our internal database structure and table
cross-linking, I have arrived at essentially the same conclusion. For what
it's worth, I have one sequentially-generated integer series that serves the
purpose of providing primary key values for essentially every table in our
system (only exceptions being a few enumerations with unambiguously finite
value sets). Like you, I am now thinking of maintaining those integers for
internal (performance) purposes only, and restricting what I expose publicly
to GUIDs (I'm leaning toward following your lead of using UUIDs for this).
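
A minimal sketch of that arrangement, assuming SQLite and Python's standard uuid module purely for illustration. The table and column names (record, id, public_guid, label) are hypothetical and not taken from either database discussed in this thread:

```python
import sqlite3
import uuid


def create_schema(conn):
    """Create one table with an internal integer key and a public UUID."""
    conn.execute(
        "CREATE TABLE record ("
        " id INTEGER PRIMARY KEY,"            # internal key, used for joins only
        " public_guid TEXT NOT NULL UNIQUE,"  # the only identifier exposed publicly
        " label TEXT NOT NULL)"
    )


def add_record(conn, label):
    """Insert a row and return the freshly generated public UUID."""
    guid = str(uuid.uuid4()).upper()
    conn.execute(
        "INSERT INTO record (public_guid, label) VALUES (?, ?)", (guid, label)
    )
    return guid


def lookup_internal_id(conn, public_guid):
    """Translate a public UUID back to the internal integer key, or None."""
    row = conn.execute(
        "SELECT id FROM record WHERE public_guid = ?", (public_guid.upper(),)
    ).fetchone()
    return row[0] if row else None
```

Joins and foreign keys would keep using the integer column, while anything leaving the system carries only public_guid.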
Aloha,
Rich
> -----Original Message-----
> From: Jerry Cooper [mailto:cooperj at landcareresearch.co.nz]
> Sent: Tuesday, June 12, 2007 1:20 PM
> To: Richard Pyle; 'Bob Morris'; Kevin Richards
> Cc: 'Gregor Hagedorn'; tdwg-guid at lists.tdwg.org
> Subject: RE: [tdwg-guid] First step in implementing LSIDs
>
> Richard,
>
> In your analysis you had ...
>
> [GloballyUniquePrefixStuff]+[LocallyUniqueIDentifier]
>
> It is probably worth saying why our LUID is a semantically
> opaque GUID. And, yes we do consciously realise that we are
> 'hedging our bets' as you put it.
>
> In our internal databases over the last few years we have
> gone through various mechanisms for providing primary
> key/foreign relationships. I recall numerous discussions
> around the practical utility of counters versus semantically
> opaque unique keys. Ultimately we decided that data we want
> to make publicly available should carry a GUID - even if it
> isn't used internally as part of a table relationship PK/FK.
> What you see on the end of our LSIDs are those public GUIDs.
> We did that prior to TDWG discussions around the adoption of
> LSIDs and GUID resolution mechanisms. It allowed us to get on
> with the job!
>
> Jerry
>
>
> >>> "Richard Pyle" <deepreef at bishopmuseum.org> 13/06/2007 10:46:18 a.m. >>>
>
> At one point last week I had visions of catching up on this
> whole thread and commenting on all sorts of things -- but I
> don't think there is any hope of that in the foreseeable
> future, so I'll just jump right in on the current discussion.
>
> It seems to me that a lot of the recent discussion has been
> confused by an unclear distinction between the needs of
> GUIDs as identifiers, and the means to resolve those GUIDs to
> data and metadata.
>
> As someone earlier pointed out, if all we need are
> identifiers (without immediate concern for the resolution
> mechanism), then UUIDs will suffice.
> So, let's start there:
>
> 0F9D60F1-59C7-495D-A37E-09C23770DD18
>
> Wonderful identifier, with utterly no information about what
> it represents, or what it would resolve to. When I type that
> text string into a web browser, I get nothing. So, we can
> either always rely on the GUID existing in some
> non-GUID-embedded context where the resolution mechanism is
> self-evident (maintained by the provider or the consumer of
> the GUID), or we can embed resolution "meta-information"
> within the GUID itself. These actually aren't fundamentally
> different, but for the sake of argument, let's assume that
> the former is unacceptable for our purposes. I can think of
> at least four obvious examples of the latter:
>
> DOI:10.1234/0F9D60F1-59C7-495D-A37E-09C23770DD18
>
> hdl:1234/0F9D60F1-59C7-495D-A37E-09C23770DD18
>
> URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
>
> http://lsid.landcareresearch.co.nz/guid/0F9D60F1-59C7-495D-A37E-09C23770DD18
>
> (Note that except for the LSID, these are my own creations.)
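
A minimal sketch of how one core UUID could be wrapped in each of the four forms listed above. The function name resolution_forms and its parameters are hypothetical; the DOI and Handle prefixes are the made-up ones from the examples, not registered prefixes:

```python
import uuid


def resolution_forms(core, doi_prefix="10.1234", handle_prefix="1234",
                     authority="landcareresearch.co.nz", namespace="Names"):
    """Wrap a bare UUID in DOI-, Handle-, LSID- and URL-style identifiers.

    Only string construction is shown here. None of these forms becomes
    resolvable merely because the string was built.
    """
    # uuid.UUID() validates the input; upper-case matches the thread's examples.
    core = str(uuid.UUID(core)).upper()
    return {
        "doi": f"DOI:{doi_prefix}/{core}",
        "handle": f"hdl:{handle_prefix}/{core}",
        "lsid": f"URN:LSID:{authority}:{namespace}:{core}",
        "http": f"http://lsid.{authority}/guid/{core}",
    }


forms = resolution_forms("0F9D60F1-59C7-495D-A37E-09C23770DD18")
```

The core UUID is identical in every form; only the prefix carrying the resolution hint changes.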
>
> The first example (DOI) doesn't actually solve the problem,
> because merely prepending "DOI:10.1234/" to my UUID
> does not, by itself, make it resolvable. For example, if I
> type Rod's earlier example of
> "doi:10.1206/0003-0082(2005)485[0001:PNAICA]2.0.CO;2" into my
> browser, I get squat. That means that either I (as a human),
> or the software application that I write, needs some "insider
> information" to resolve the GUID. I don't see how this is any
> different from option numbers 2 or 3. Sure, you could argue
> that there is only one DOI system, or only one Handle system
> (in the same general sense that there is only one DNS
> system), so the amount of information required to resolve the
> GUID is relatively small and universally known. But it seems
> to me this is mostly true of LSIDs as well. In fact, if I
> simply install the LSID Launchpad plug-in for my browser, I
> *can* type the LSID in and get metadata back automatically
> (as long as I chop the "URN:" part -- which seems like an
> unnecessary step). The main difference among the first three
> that I see is that in this blink-of-an-eye-moment in history,
> there happen to be a lot of people using DOIs (and Handles?),
> whereas LSIDs have not (yet?) caught on as widely.
>
> So, that leaves the fourth option (HTTP URL): the obvious
> advantage of which is that resolution is a snap using current
> browser protocols and technologies (no plug-ins required),
> and development support is much better (given how widely used
> HTTP is); and the apparent disadvantages being some social
> concern of link-rot and/or uncertainty about the permanency
> of the owner of the domain.
>
> Most of this has already been addressed in this thread. What
> I haven't seen addressed (maybe I missed it?) is the
> decreasing opacity of the GUID as you go through the sequence
> above from UUID to URL. The decreasing opacity, as far as I
> understand it, is directly tied to the increasing reliance on
> embedded "meta-information" within the GUID itself to allow
> the GUID to be self-resolving.
>
> We all know that the value of a GUID is a function of its
> permanency, and we all agree that the weakest link in the
> chain of permanency is the social contract part. To me, one
> of the greatest values of maintaining opacity of GUIDs is to
> help facilitate (encourage) the social contract for permanency.
> Or, put another way, to decrease the opportunity/temptation
> for breaking the social contract for permanency.
>
> Using the examples above, if the domain name
> "landcareresearch.co.nz" is ever abandoned or changed or
> otherwise broken, the URL will die, but the LSID will
> continue to function (if I understand the LSID protocol
> correctly). In the case of HTTP, the domain name relies on
> current DNS mapping, whereas the LSID does not.
>
> So, my overall point here is that we seem to have two topics
> that are being conflated: GUIDs as identifiers per se, and
> resolution mechanisms for the GUIDs. The more reliable and
> simple the GUIDs are in terms of being self-resolving, the less
> opaque they become, and the more concern people have
> (justifiably or not) that permanency will be jeopardized.
>
> This leads me to the second part of this post, which I've
> already hinted at earlier.
>
> Most GUID schemes seem to follow the basic pattern of:
>
> [GloballyUniquePrefixStuff]+[LocallyUniqueIDentifier]
>
> Generally, the "GUPS" part is somehow attached to the issuer,
> and the "LUID"
> part is unique only within the context of the "GUPS". Also,
> any self-embedded resolution information is contained within
> the "GUPS".
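
A rough sketch of the GUPS/LUID split for the four forms used in this message. The function name split_guid is hypothetical, and the rules are simplifying assumptions: the LSID is taken to have no trailing revision component, and the URL's last path segment is taken to be the LUID:

```python
def split_guid(guid):
    """Return (prefix, local_id) for an LSID, DOI/Handle, URL or bare ID."""
    lower = guid.lower()
    if lower.startswith("urn:lsid:"):
        # urn : lsid : authority : namespace : object
        parts = guid.split(":", 4)
        if len(parts) != 5:
            raise ValueError(f"not a complete LSID: {guid!r}")
        return ":".join(parts[:4]) + ":", parts[4]
    if lower.startswith(("http://", "https://")):
        # Assumption: everything after the last slash is the local identifier.
        prefix, _, local_id = guid.rpartition("/")
        return prefix + "/", local_id
    if lower.startswith(("doi:", "hdl:")):
        # The issuer prefix ends at the first slash.
        prefix, sep, local_id = guid.partition("/")
        if not sep:
            raise ValueError(f"missing '/' separator: {guid!r}")
        return prefix + "/", local_id
    # A bare identifier such as a UUID carries no prefix at all.
    return "", guid
```

When the LUID is itself a UUID, the second element of the result is the same string whichever wrapper it arrived in.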
>
> In the example at the beginning of this message, I started
> with a UUID, which by itself is globally unique. I carried
> this through to the other examples, such that the "LUID"
> portion of each example was actually by itself a GUID.
> Obviously, there is no guarantee of that for the "LUID" part
> of DOIs, Handles, LSIDs or URLs...but I keep asking myself
> why we as a community (i.e., TDWG) don't come out with some
> sort of "best practice" (if not recommendation, or even
> outright standard) that those of us who have not yet begun to
> issue GUIDs (but soon will) all make an effort to use
> something like UUIDs for the "object identifier" ("LUID")
> portion of the GUIDs we issue.
>
> My original thought was to register a Handle prefix to modify
> my LUID, such that "987654321" becomes "1234/987654321",
> which could then become
> "URN:LSID:bishopmuseum.org:1234:987654321" (or perhaps
> "URN:LSID:bishopmuseum.org:guid:1234/987654321", or even
> "URN:LSID:bishopmuseum.org:Names:1234/987654321"), which
> could then become
> "http://guid.bishopmuseum.org/?URN:LSID:bishopmuseum.org:1234:987654321"
> (or maybe just "http://guid.bishopmuseum.org/?1234:987654321").
>
> But now I'm thinking maybe it's best to abandon the Handle
> part, and just issue UUIDs as my local identifiers, which can
> then be embedded in as many other GUID-resolution layers as I
> wish. This seems to be more or less what Kevin Richards has
> done for his LSIDs converted to URLs:
>
> http://lsid.landcareresearch.co.nz/lsid/URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
>
> In any case, my point is that it seems to make sense to me
> that we can all hedge our bets on which resolution mechanism
> emerges victorious by starting off with a resolution-less
> UUID (or some other GUID serving a pure "identifier" role) at
> the core of our locally-issued GUIDs, and then represent
> those core identifiers through different resolution-level
> GUIDs however we see fit.
>
> I would very much appreciate it if someone could explain to
> me what I am missing.
>
> Aloha,
> Rich
>
>
> > -----Original Message-----
> > From: tdwg-guid-bounces at lists.tdwg.org
> > [mailto:tdwg-guid-bounces at lists.tdwg.org] On Behalf Of Bob Morris
> > Sent: Tuesday, June 12, 2007 11:02 AM
> > To: Kevin Richards
> > Cc: tdwg-guid at lists.tdwg.org; Gregor Hagedorn
> > Subject: Re: [tdwg-guid] First step in implementing LSIDs
> >
> > Is the http proxy a GUID?
> >
> > What vouches for an assertion that a given proxy actually resolves the
> > associated LSID?
> >
> >
> >
> > On 6/12/07, Kevin Richards <RichardsK at landcareresearch.co.nz> wrote:
> > >
> > >
> > > The main use of the LSID http proxy would be when you have an RDF
> > > document or a triple store full of data that has BOTH the LSID and the
> > > http version (as described on the wiki page
> > > http://wiki.tdwg.org/twiki/bin/view/GUID/LsidHttpProxyUsageRecommendation).
> > >
> > > If ALL you had was the LSID
> > > "URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18"
> > > and no other data at all, and you want to resolve it using the http
> > > proxy, then yes you are a bit stuck (as far as I know). Unless we set
> > > up an LSID http proxy repository that can be queried and returns the
> > > http proxy url for an LSID? Or you could just use the "hard coded"
> > > http proxy resolver at http://lsid.tdwg.org/[lsid].
> > >
> > > Kevin
> > >
> > > >>> "Gregor Hagedorn" <G.Hagedorn at BBA.DE> 12/06/2007
> > 7:17:35 p.m. >>>
> > >
> > > > For those wanting another LSID http proxy example, I have changed
> > > > our LSID resolver here at Landcare to serve up the proxy compliant RDF.
> > > >
> > > > Eg
> > > >
> > > > http://lsid.landcareresearch.co.nz/lsid/URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
> > > >
> > > > returns the metadata for the lsid
> > > >
> > > > URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
> > > >
> > > > It took me about an hour to change the RDF generator and set up a
> > > > redirection web directory on our web server (Microsoft IIS).
> > >
> > > I have an object with
> > >
> > > URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
> > >
> > > I think the problem is that, without me telling my software something
> > > I have gathered from this email, my software has no means to know
> > > about what you describe as easy.
> > >
> > > Unless it has an LSID resolver, in which case it would not need the
> > > http method.
> > >
> > > This is what the proposal to always use alternating http and LSID
> > > guids in any object we communicate about is saying. Whenever you
> > > publish
> > > URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
> > > you also have to provide the http version of it.
> > >
> > > Gregor----------------------------------------------------------
> > > Gregor Hagedorn (G.Hagedorn at bba.de)
> > > Institute for Plant Virology, Microbiology, and Biosafety Federal
> > > Research Center for Agriculture and Forestry (BBA)
> > > Königin-Luise-Str. 19 Tel: +49-30-8304-2220
> > > 14195 Berlin, Germany Fax: +49-30-8304-2203
> > >
> > > _______________________________________________
> > > tdwg-guid mailing list
> > > tdwg-guid at lists.tdwg.org
> > > http://lists.tdwg.org/mailman/listinfo/tdwg-guid
> > >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > ++++++++++++++
> > > WARNING: This email and any attachments may be
> > confidential and/or
> > > privileged. They are intended for the addressee only and
> > are not to be
> > > read, used, copied or disseminated by anyone receiving
> > them in error.
> > > If you are not the intended recipient, please notify the
> sender by
> > > return email and delete this message and any attachments.
> > >
> > > The views expressed in this email are those of the sender
> > and do not
> > > necessarily reflect the official views of Landcare Research.
> > >
> > > Landcare Research
> > > http://www.landcareresearch.co.nz
> > >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > ++++++++++++++
> > >
> > >