[tdwg-guid] Is the http proxy of an LSID a GUID as well?
Hi Bob,
Personally, I would hold on to pure LSIDs and avoid using the http proxy as a GUID as much as possible, just to make sure my clients are robust enough. However, the semantics of the proxy software and the RDF returned when the identifier is resolved allow clients to assume the http proxy is in fact a GUID.
Below is an example of RDF that you get back when you resolve the proxy LSID.
<rdf:RDF>
  <rdf:Description rdf:about="urn:lsid:ubio.org:namebank:11815">
    <dc:identifier>urn:lsid:ubio.org:namebank:11815</dc:identifier>
    <owl:sameAs rdf:resource="http://lsid.tdwg.org/urn:lsid:ubio.org:namebank:11815" />
    ...
In the case above, the LSID in its pure form is used as the identifier in *rdf:about* and *dc:identifier* (the namespace dc stands for Dublin Core). However, the *owl:sameAs* statement asserts that the LSID and its http proxy are equivalent and interchangeable. The proxy (the software at http://lsid.tdwg.org, for example) is the agent that actually implements that equivalence between the identifiers, without asserting anything about the resolvability of the proxied LSID. In my opinion, you can only assume that the proxy software will be able to resolve a given identifier as long as both the proxy software itself and the LSID authority are operational. The proxy software cannot be deemed an authority for that LSID. I think that this was your original concern, wasn't it?
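To make the equivalence concrete, here is a minimal sketch of how a client might read the owl:sameAs assertion out of such a response. It assumes the response is well-formed RDF/XML with the standard rdf, dc and owl namespace declarations; SAMPLE_RDF is a hypothetical complete document modelled on the fragment above, and equivalent_ids is an illustrative name, not part of any LSID library.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF, Dublin Core elements and OWL.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Hypothetical complete document modelled on the fragment in this message.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="urn:lsid:ubio.org:namebank:11815">
    <dc:identifier>urn:lsid:ubio.org:namebank:11815</dc:identifier>
    <owl:sameAs rdf:resource="http://lsid.tdwg.org/urn:lsid:ubio.org:namebank:11815" />
  </rdf:Description>
</rdf:RDF>"""


def equivalent_ids(rdf_xml):
    """Map each rdf:about subject to the list of its owl:sameAs targets."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]
    resource = "{%s}resource" % NS["rdf"]
    result = {}
    for desc in root.findall("rdf:Description", NS):
        subject = desc.get(about)
        targets = [e.get(resource) for e in desc.findall("owl:sameAs", NS)]
        result.setdefault(subject, []).extend(targets)
    return result


if __name__ == "__main__":
    for subject, targets in equivalent_ids(SAMPLE_RDF).items():
        print(subject, "sameAs", targets)
```

A client that treats both forms as one GUID could use this mapping to merge the two identifiers into a single node before doing any graph work.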
In summary, there is no problem assuming that a proxy version of an LSID is a GUID. But there are operational implications of using either form. It all depends on whether the client is aware of the LSID spec and its particular implementation. For example, if your client isn't aware of the LSID spec, then you have no choice but to store and use the proxy version to get back to the object associated with the ID. The client may use the pure LSID as a URI for internal calculations, such as graph merging and querying, but it won't be able to resolve it directly.
On the other hand, if your client is aware of the LSID spec, you have more options. You know how to get from an LSID to a proxy version and back. You can use that to your advantage, or not. You may want to store both versions so that your client is more robust to link rot, and so on.
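Getting from an LSID to its proxy version and back can be sketched as below. This assumes the proxy form is simply the proxy base URL followed by the LSID, as in the example RDF; the function names and the PROXY_BASE constant are illustrative only.

```python
# Proxy base used in this thread; other proxies are assumed to follow
# the same "base URL + LSID" pattern.
PROXY_BASE = "http://lsid.tdwg.org/"
LSID_PREFIX = "urn:lsid:"


def is_lsid(identifier):
    """True if the string looks like an LSID (the URN prefix is case-insensitive)."""
    return identifier.lower().startswith(LSID_PREFIX)


def to_proxy(lsid, proxy_base=PROXY_BASE):
    """Return the http proxy form of a pure LSID."""
    if not is_lsid(lsid):
        raise ValueError("not an LSID: %r" % lsid)
    return proxy_base + lsid


def to_lsid(proxy_url, proxy_base=PROXY_BASE):
    """Return the pure LSID embedded in a proxy URL."""
    if not proxy_url.startswith(proxy_base):
        raise ValueError("not a URL under %s: %r" % (proxy_base, proxy_url))
    lsid = proxy_url[len(proxy_base):]
    if not is_lsid(lsid):
        raise ValueError("no LSID found in %r" % proxy_url)
    return lsid


def both_forms(identifier, proxy_base=PROXY_BASE):
    """Return (lsid, proxy_url) regardless of which form was supplied."""
    lsid = identifier if is_lsid(identifier) else to_lsid(identifier, proxy_base)
    return lsid, to_proxy(lsid, proxy_base)


if __name__ == "__main__":
    print(both_forms("urn:lsid:ubio.org:namebank:11815"))
```

Storing the pair returned by both_forms is one way to keep both versions at hand, so the client can fall back on native LSID resolution if the proxy goes away.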
Cheers,
Ricardo
Bob Morris wrote:
Is the http proxy a GUID?
Ricardo writes:
Personally, I would hold on to pure LSIDs and avoid using the http proxy as a GUID as much as possible, just to make sure my clients are robust enough. However, the semantics of the proxy software and the RDF returned when the identifier is resolved allow clients to assume the http proxy is in fact a GUID.
My concern is that the choice of LSIDs is based on too narrow a system view. The true system includes biologists, public administrators, other experts, and many kinds of non-LSID-aware software.
Scenario 1: A taxonomist publishes a paper in a printed or online journal. Do I use the LSID, knowing no other biologists can interpret this, or do I use the http-GUID, which every biologist reading the paper can follow and retrieve some further information?
Scenario 2: I create an HTML-based web page for a species account. I could use the LSID as the visible id, with the implications above. However, I would certainly like to support my users by linking to the resource, and use the http URL as the link behind the text. From this moment on, for Google and all other current information indexers, the http-URL (which is always a GUID) will become the preferred ID under which this resource can be accessed.
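As a rough illustration of Scenario 2, the sketch below generates the kind of markup I mean: the LSID as the visible text and the http proxy URL as the link target. The function name is hypothetical, and the proxy base is the lsid.tdwg.org example from this thread.

```python
from html import escape


def lsid_anchor(lsid, proxy_base="http://lsid.tdwg.org/"):
    """Build an HTML link that shows the LSID but points at its http proxy URL."""
    href = proxy_base + lsid
    return '<a href="%s">%s</a>' % (escape(href, quote=True), escape(lsid))


if __name__ == "__main__":
    print(lsid_anchor("urn:lsid:ubio.org:namebank:11815"))
```

An indexer following this page only ever sees the href value, which is why the http form ends up as the identifier it records.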
What is the public usability of the GUID system that we plan?
---
The desired fall-back safety of Ricardo's scenario:
<rdf:RDF>
  <rdf:Description rdf:about="urn:lsid:ubio.org:namebank:11815">
    <dc:identifier>urn:lsid:ubio.org:namebank:11815</dc:identifier>
    <owl:sameAs rdf:resource="http://lsid.tdwg.org/urn:lsid:ubio.org:namebank:11815" />
    ...
can easily be achieved without the overhead of LSIDs, by using:
<rdf:Description rdf:about="http://lsid.ubio.org/namebank/11815">
  <dc:identifier>http://lsid.ubio.org/namebank/11815</dc:identifier>
  <owl:sameAs rdf:resource="http://fallbackresolver.bio-id.org/lsid.ubio.org/namebank/11815"/>
  ...
Several commentators have mentioned that http-GUIDs have a problem if the resource disappears. I believe this is only true with respect to resolving the GUID. The GUID functionality itself is unaffected if the DNS entry disappears. If ubio disappears, the GUID http://lsid.ubio.org/namebank/11815 can still be used for centuries. The service that resolves it may already be published in RDF metadata (through owl:sameAs), or it may be built into some systems.
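A minimal sketch of how a client could keep resolving such an http-GUID after the original host is gone. It assumes the fallback resolver uses the URL layout from my example above (fallback base, then the original host and path); fallbackresolver.bio-id.org is a hypothetical host, and the fetch function is passed in by the caller so the same logic works with any HTTP library.

```python
from urllib.parse import urlsplit

# Hypothetical fallback resolver, taken from the example in this message.
FALLBACK_BASE = "http://fallbackresolver.bio-id.org/"


def fallback_url(guid, fallback_base=FALLBACK_BASE):
    """Rewrite an http-GUID into the assumed fallback-resolver layout."""
    parts = urlsplit(guid)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError("not an http GUID: %r" % guid)
    return fallback_base + parts.netloc + parts.path


def resolve(guid, fetch, same_as=()):
    """Try the GUID itself, then published owl:sameAs URLs, then the fallback.

    fetch(url) must return the retrieved data or raise OSError on failure.
    Returns a (url_that_worked, data) tuple.
    """
    candidates = [guid, *same_as, fallback_url(guid)]
    for url in candidates:
        try:
            return url, fetch(url)
        except OSError:
            continue  # host gone or unreachable, try the next candidate
    raise LookupError("no resolver answered for %s" % guid)
```

With the standard library, fetch could be something like `lambda u: urllib.request.urlopen(u, timeout=10).read()`, since urllib.error.URLError is a subclass of OSError. The GUID string itself never changes; only the place the client asks about it does.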
*** Using something like "bio-id.org" (i.e. registering a new DNS name, but running it through existing infrastructure, preferably GBIF) would improve future flexibility on who might pick it up, if something happens to tdwg or gbif. ***
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn@bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203
participants (2)
- Gregor Hagedorn
- Ricardo Pereira