I agree with Rich: there seem to be two conflated issues being cross-threaded in all these discussions.

1. An identifier that is simply globally unique - that is, the id is never duplicated and always refers to the same thing. So, you can use it as a unique reference in a paper, like an ISBN/ISSN number. But more importantly, it can also be used in data files/serialized XML to enable computers to quickly compare import/export records for merge/update, which is an important function for many, many biodiversity data projects. But this id does not itself tell you where the thing can be found. Its location must come from another source.

2. An identifier that is globally locatable via the Internet - that is, the id is never duplicated and always locates the same thing (with a further definition needed of what the thing is). The globally locatable identifier needs to be locatable by a web browser (HTTP) but, more importantly, also by web services, which may want to use a different protocol.

1 can be included in 2, as Nick has demonstrated.

I think we need some standard use cases against which we could test the options. I'm afraid this discussion in the abstract will be endless. Do we have clear use cases for the biodiversity data GUIDs? How well do relocatable identifiers work in the use cases involving the exchange/compare/import/export of millions of records from multiple sources?

Chuck

-----Original Message-----
From: tdwg-guid-bounces@lists.tdwg.org [mailto:tdwg-guid-bounces@lists.tdwg.org] On Behalf Of Richard Pyle
Sent: Tuesday, June 12, 2007 5:46 PM
To: 'Bob Morris'; 'Kevin Richards'
Cc: tdwg-guid@lists.tdwg.org; 'Gregor Hagedorn'
Subject: RE: [tdwg-guid] First step in implementing LSIDs

At one point last week I had visions of catching up on this whole thread and commenting on all sorts of things -- but I don't think there is any hope of that in the foreseeable future, so I'll just jump right in on the current discussion.
It seems to me that a lot of the recent discussion has been confused by an unclear distinction between the needs of GUIDs as identifiers, and the means to resolve those GUIDs to data and metadata. As someone earlier pointed out, if all we need are identifiers (without immediate concern for the resolution mechanism), then UUIDs will suffice. So, let's start there:

0F9D60F1-59C7-495D-A37E-09C23770DD18

Wonderful identifier, with utterly no information about what it represents, or what it would resolve to. When I type that text string into a web browser, I get nothing. So, we can either always rely on the GUID existing in some non-GUID-embedded context where the resolution mechanism is self-evident (maintained by the provider or the consumer of the GUID), or we can embed resolution "meta-information" within the GUID itself. These actually aren't fundamentally different, but for the sake of argument, let's assume that the former is unacceptable for our purposes. I can think of at least four obvious examples of the latter:

DOI:10.1234/0F9D60F1-59C7-495D-A37E-09C23770DD18
hdl:1234/0F9D60F1-59C7-495D-A37E-09C23770DD18
URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
http://lsid.landcareresearch.co.nz/guid/0F9D60F1-59C7-495D-A37E-09C23770DD18

(Note that except for the LSID, these are my own creations.)

The first example (DOI) doesn't actually solve the problem, because merely prepending "DOI:10.1234/" to my UUID does not, by itself, make it resolvable. For example, if I type Rod's earlier example of "doi:10.1206/0003-0082(2005)485[0001:PNAICA]2.0.CO;2" into my browser, I get squat. That means that either I (as a human), or the software application that I write, needs some "insider information" to resolve the GUID. I don't see how this is any different from option numbers 2 or 3.
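To make the four wrappings concrete, here is a minimal Python sketch that takes the bare UUID and produces each of the forms listed above. The prefixes "10.1234" and "1234" and the "/guid/" URL path are Rich's invented examples, not registered values, and the function name wrap_identifier is hypothetical. The sketch only builds strings; it says nothing about whether any of them would actually resolve.

```python
import uuid


def wrap_identifier(core: uuid.UUID) -> dict[str, str]:
    """Return the bare UUID plus the four prefixed forms from the message.

    All prefixes are illustrative values copied from the thread; only the
    LSID authority/namespace pair is described there as a real one.
    """
    luid = str(core).upper()  # the thread writes UUIDs in upper case
    return {
        "uuid": luid,
        "doi": f"DOI:10.1234/{luid}",
        "handle": f"hdl:1234/{luid}",
        "lsid": f"URN:LSID:landcareresearch.co.nz:Names:{luid}",
        "http": f"http://lsid.landcareresearch.co.nz/guid/{luid}",
    }


core = uuid.UUID("0F9D60F1-59C7-495D-A37E-09C23770DD18")
forms = wrap_identifier(core)
for scheme, text in forms.items():
    print(f"{scheme:7} {text}")
```

Every form ends in the same UUID, so two records can be compared on that core alone, whichever wrapping they arrived in.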
Sure, you could argue that there is only one DOI system, or only one Handle system (in the same general sense that there is only one DNS system), so the amount of information required to resolve the GUID is relatively small and universally known. But it seems to me this is mostly true of LSIDs as well. In fact, if I simply install the LSID Launchpad plug-in for my browser, I *can* type the LSID in and get metadata back automatically (as long as I chop the "URN:" part -- which seems like an unnecessary step). The main difference among the first three that I see is that in this blink-of-an-eye-moment in history, there happen to be a lot of people using DOIs (and Handles?), whereas LSIDs have not (yet?) caught on as widely.

So, that leaves the fourth option (HTTP URL): the obvious advantage of which is that resolution is a snap using current browser protocols and technologies (no plug-ins required), and development support is much better (given how widely used HTTP is); and the apparent disadvantages being some social concern of link-rot and/or uncertainty about the permanency of the owner of the domain. Most of this has already been addressed in this thread.

What I haven't seen addressed (maybe I missed it?) is the decreasing opacity of the GUID as you go through the sequence above from UUID to URL. The decreasing opacity, as far as I understand it, is directly tied to the increasing reliance on embedded "meta-information" within the GUID itself to allow the GUID to be self-resolving.

We all know that the value of a GUID is a function of its permanency, and we all agree that the weakest link in the chain of permanency is the social contract part. To me, one of the greatest values of maintaining opacity of GUIDs is to help facilitate (encourage) the social contract for permanency. Or, put another way, to decrease the opportunity/temptation for breaking the social contract for permanency.
Using the examples above, if the domain name "landcareresearch.co.nz" is ever abandoned or changed or otherwise broken, the URL will die, but the LSID will continue to function (if I understand the LSID protocol correctly). In the case of HTTP, the domain name relies on current DNS mapping, whereas the LSID does not.

So, my overall point here is that we seem to have two topics that are being conflated: GUIDs as identifiers per se, and resolution mechanisms for the GUIDs. The more reliable and simple the GUIDs are in terms of being self-resolving, the less opaque they become, and the more concern people have (justifiably or not) that permanency will be jeopardized.

This leads me to the second part of this post, which I've already hinted at earlier. Most GUID schemes seem to follow the basic pattern of:

[GloballyUniquePrefixStuff]+[LocallyUniqueIDentifier]

Generally, the "GUPS" part is somehow attached to the issuer, and the "LUID" part is unique only within the context of the "GUPS". Also, any self-embedded resolution information is contained within the "GUPS".

In the example at the beginning of this message, I started with a UUID, which by itself is globally unique. I carried this through to the other examples, such that the "LUID" portion of each example was actually by itself a GUID. Obviously, there is no guarantee of that for the "LUID" part of DOIs, Handles, LSIDs or URLs...but I keep asking myself why we as a community (i.e., TDWG) don't come out with some sort of "best practice" (if not recommendation, or even outright standard) that those of us who have not yet begun to issue GUIDs (but soon will) all make an effort to use something like UUIDs for the "object identifier" ("LUID") portion of the GUIDs we issue.
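The [GUPS]+[LUID] pattern, and the suggested practice of using a UUID as the LUID, can be sketched as a small check. This is only an illustration under a loud assumption: that the LUID is whatever follows the last ":" or "/" in the identifier, which holds for the simple examples in this thread but not for every DOI or URL. The function names split_guid and luid_is_uuid are hypothetical.

```python
import uuid


def split_guid(guid: str) -> tuple[str, str]:
    """Split an identifier into (GUPS, LUID) at the last ':' or '/'.

    Assumption: the locally unique part is the final segment. An
    identifier with no separator is treated as a bare LUID.
    """
    cut = max(guid.rfind(":"), guid.rfind("/"))
    if cut < 0:
        return "", guid
    return guid[: cut + 1], guid[cut + 1 :]


def luid_is_uuid(guid: str) -> bool:
    """True if the LUID portion parses as a UUID, i.e. is itself a GUID."""
    _, luid = split_guid(guid)
    try:
        uuid.UUID(luid)
    except ValueError:
        return False
    return True


for candidate in (
    "hdl:1234/0F9D60F1-59C7-495D-A37E-09C23770DD18",
    "URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18",
    "URN:LSID:bishopmuseum.org:1234:987654321",
):
    print(luid_is_uuid(candidate), split_guid(candidate))
```

An identifier that passes this check keeps a globally unique core even if its prefix is later stripped or swapped for a different resolution scheme.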
My original thought was to register a Handle prefix to modify my LUID, such that "987654321" becomes "1234/987654321", which could then become "URN:LSID:bishopmuseum.org:1234:987654321" (or perhaps "URN:LSID:bishopmuseum.org:guid:1234/987654321", or even "URN:LSID:bishopmuseum.org:Names:1234/987654321"), which could then become "http://guid.bishopmuseum.org/?URN:LSID:bishopmuseum.org:1234:987654321" (or maybe just "http://guid.bishopmuseum.org/?1234:987654321"). But now I'm thinking maybe it's best to abandon the Handle part, and just issue UUIDs as my local identifiers, which can then be embedded in as many other GUID-resolution layers as I wish. This seems to be more or less what Kevin Richards has done for his LSIDs converted to URLs:

http://lsid.landcareresearch.co.nz/lsid/URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18

In any case, my point is that it seems to make sense to me that we can all hedge our bets on which resolution mechanism emerges victorious by starting off with a resolution-less UUID (or some other GUID serving a pure "identifier" role) at the core of our locally-issued GUIDs, and then represent those core identifiers through different resolution-level GUIDs however we see fit.

I would very much appreciate it if someone could explain to me what I am missing.

Aloha,
Rich
-----Original Message-----
From: tdwg-guid-bounces@lists.tdwg.org [mailto:tdwg-guid-bounces@lists.tdwg.org] On Behalf Of Bob Morris
Sent: Tuesday, June 12, 2007 11:02 AM
To: Kevin Richards
Cc: tdwg-guid@lists.tdwg.org; Gregor Hagedorn
Subject: Re: [tdwg-guid] First step in implementing LSIDs
Is the http proxy a GUID?
What vouches for an assertion that a given proxy actually resolves the associated LSID?
On 6/12/07, Kevin Richards <RichardsK@landcareresearch.co.nz> wrote:
The main use of the LSID http proxy would be when you have an RDF document or a triple store full of data that has BOTH the LSID and the http version (as described on the wiki page
http://wiki.tdwg.org/twiki/bin/view/GUID/LsidHttpProxyUsageRecommendation).
If ALL you had was the LSID
"URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18"
and no other data at all, and you wanted to resolve it using the http proxy, then yes, you are a bit stuck (as far as I know). Unless we set up an LSID http proxy repository that can be queried and returns the http proxy url for an LSID? Or you could just use the "hard coded" http proxy resolver at http://lsid.tdwg.org/[lsid].
Kevin
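As a rough illustration of the proxy convention described in this message, the sketch below checks that a string has the LSID shape (urn:lsid:authority:namespace:object, with an optional revision) and appends it to a proxy base URL. The helper names parse_lsid and proxy_url are hypothetical, and the default base assumes the "hard coded" resolver is meant to be http://lsid.tdwg.org/. The LSID is appended unescaped, as in the Landcare example in this thread. Nothing here contacts a resolver.

```python
def parse_lsid(lsid: str) -> dict[str, str]:
    """Split an LSID into its named fields, or raise ValueError."""
    parts = lsid.split(":")
    if (
        len(parts) not in (5, 6)
        or parts[0].lower() != "urn"
        or parts[1].lower() != "lsid"
    ):
        raise ValueError(f"not an LSID: {lsid!r}")
    fields = {"authority": parts[2], "namespace": parts[3], "object": parts[4]}
    if len(parts) == 6:
        fields["revision"] = parts[5]
    return fields


def proxy_url(lsid: str, proxy_base: str = "http://lsid.tdwg.org/") -> str:
    """Build the http proxy form: proxy base followed by the full LSID."""
    parse_lsid(lsid)  # reject strings that are not shaped like an LSID
    return proxy_base + lsid


lsid = "URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18"
print(parse_lsid(lsid))
print(proxy_url(lsid))
print(proxy_url(lsid, "http://lsid.landcareresearch.co.nz/lsid/"))
```

The point made in the message still stands: software holding only the LSID has to be told which proxy base to use, and that knowledge sits outside the identifier itself.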
"Gregor Hagedorn" <G.Hagedorn@BBA.DE> 12/06/2007 7:17:35 p.m. >>>
For those wanting another LSID http proxy example, I have changed our LSID resolver here at Landcare to serve up the proxy compliant RDF.
E.g.
http://lsid.landcareresearch.co.nz/lsid/URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
returns the metadata for the LSID
URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
It took me about an hour to change the RDF generator and set up a redirection web directory on our web server (Microsoft IIS).
I have an object with
URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
I think the problem is that, unless I tell my software something I have gathered from this email, it has no means of knowing how to do what you describe as easy.
Unless it has an LSID resolver, in which case it would not need the http method.
This is what the proposal to always use alternating http and LSID guids in any object we communicate about is saying. Whenever you publish
URN:LSID:landcareresearch.co.nz:Names:0F9D60F1-59C7-495D-A37E-09C23770DD18
you also have to provide the http version of it.
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn@bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203
_______________________________________________ tdwg-guid mailing list tdwg-guid@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-guid