Re: [tdwg-tag] SourceForge LSID project websites broken - role for TDWG? [SEC=UNCLASSIFIED]

8 Apr 2009

      PURL := Technology and likely beyond the reach of many data providers.  LSID is
probably the easy way out because they do not need to be resolved to be useful.
An unresolvable PURL is no PURL at all.  Either way depends on infrastructure.

Right now I am thinking, why not have both? Different clients,
different requirements, different solutions. We do this all the time.
Is there a reason why we should not alias PURL and LSID (or DOI)  as
identifiers for the one resource and join in both games?  PURLs from
providers already committed to LSID and using UUID local identifiers
may end up looking a little strange if not repetitive but there would
be little change in overhead for most providers. For most of us GUIDs
are for external consumption anyway.
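
As a rough illustration of the aliasing idea, here is a minimal sketch that derives a PURL-style HTTP alias from an LSID so that both identifiers name the one resource. The base URL http://purl.example.org/ and the function name lsid_to_purl_alias are assumptions made up for this example; nothing in the thread fixes either.

```python
def lsid_to_purl_alias(lsid, base="http://purl.example.org/"):
    """Return a hypothetical HTTP alias for an LSID.

    LSID layout: urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    The base URL is an assumed placeholder, not a real resolver.
    """
    parts = lsid.split(":")
    # Expect 5 parts, or 6 when an optional revision is present.
    if len(parts) not in (5, 6):
        raise ValueError("not an LSID: %r" % lsid)
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID: %r" % lsid)
    if not all(parts[2:]):
        raise ValueError("LSID has an empty component: %r" % lsid)
    # Reuse authority, namespace, object (and revision) as path segments.
    return base.rstrip("/") + "/" + "/".join(parts[2:])


if __name__ == "__main__":
    print(lsid_to_purl_alias("urn:lsid:csiro.tdwg.org:anic:12345"))
    # A UUID-based LSID gives the long, slightly repetitive form noted above.
    print(lsid_to_purl_alias(
        "urn:lsid:example.org:specimens:550e8400-e29b-41d4-a716-446655440000"))
```

The alias carries the same authority, namespace and local identifier as the LSID, so a provider would only need to register the mapping once.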

For the infrastructure, which may be quite lightweight though with a
heavy social component, I have gleaned a list of candidate requirements
from the thread:

.   GUIDs are URIs
.   Objects may have more than one GUID (URI aliases)
.   GUIDs are resolvable, at least for dereferenceable URIs
.   GUIDs resolve to RDF using standard vocabularies
.   Data provider resolves the GUID or delegates authority for this responsibility
.   Authorities use sub-domains to simplify transfer of delegation
.   GUIDs := <uri class>[:/+]<authority>[:/]<namespace>[:/]<uniqueLocalIdentifier>
.   GUID classes may be delegated independently
.   Delegate/provider and GUID assignment must be registered
.   Delegates guarantee to serve provider metadata intact
.   Derivative objects must reference source URI
.   Derivative objects must not present as aliases
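
The GUID pattern in the list above could be checked with something like the following sketch. It reads "[:/+]" as "one or more of ':' or '/'" and assumes only two URI classes (urn:lsid and http/https); both readings are my assumptions, as are the names GUID_PATTERN and parse_guid and the example HTTP URI. Authorities with a port number are not handled.

```python
import re

# Assumed reading of:
#   <uri class>[:/+]<authority>[:/]<namespace>[:/]<uniqueLocalIdentifier>
GUID_PATTERN = re.compile(
    r"^(?P<uri_class>urn:lsid|https?)"  # assumed set of URI classes
    r"[:/]+"                            # ':' for LSIDs, '://' for HTTP URIs
    r"(?P<authority>[^:/]+)"
    r"[:/]"
    r"(?P<namespace>[^:/]+)"
    r"[:/]"
    r"(?P<local_id>.+)$"
)


def parse_guid(guid):
    """Split a GUID into its four components, or raise ValueError."""
    match = GUID_PATTERN.match(guid)
    if match is None:
        raise ValueError("does not fit the GUID pattern: %r" % guid)
    return match.groupdict()


if __name__ == "__main__":
    for guid in ("urn:lsid:csiro.tdwg.org:anic:12345",
                 "http://purl.example.org/anic/12345"):
        print(parse_guid(guid))
```

Both forms yield the same namespace and local identifier here, which is what would let a registry record them as aliases for one object.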

I imagine existing aggregators stepping in to take on delegation of
resolution services and LSID proxies where required. Will they object?

greg

appropriate English idioms:  Cut the cloth ... silk purse ... casting purls ...

2009/4/8 Roger Hyam <rogerhyam@mac.com>:
...
General comments on decision making...
This is a technical discussion list. We are clever techie people.
Given a challenge of X resources and Y requirements we can come up
with a preferred list of solutions and probably implement one of them
with our eyes closed.
To use a fine English idiom "you cut your cloth to suit your purse".
We don't have a shared purse so arguments about how to cut our cloth
will never be resolved. X is undefined and I am not sure Y is that
well defined.
I know this is "chicken and egg" in that we need to come up with
requirements to request a purse, but we really need to have some
indication that someone will be willing to commit long term resources
to the common good before we can present a menu of choices for what
money could be spent on.
1.0 developers/year (in perpetuity) gets us a server or two managed to
support some kind of DNS based SRV hosting or redirect services or
Handle system with support for some library development and help desk
stuff.  (Note I am not talking servers or meetings or reports or
technology; I am talking commitment to pay people to have it as
their responsibility to maintain the system both socially and
technically - for the long term!!!).
Without the indication that someone (a consortium perhaps) is likely
to formally commit to a minimum of this level of resources we are
wasting our time talking about resolution mechanisms that are not DNS
based i.e. variations on the PURL model.
If we don't have the money to build a walled garden we have to graze
on the common with everyone else.
Roger
(BTW: I am not totally convinced that building a walled garden is the
way forward but would happily come and graze in it if some benefactor
would fund its perpetual maintenance).
On 8 Apr 2009, at 01:40, Donald.Hobern@csiro.au wrote:
...
A few comments on semantic opacity...
1. My examples ("urn:lsid:csiro.tdwg.org:anic:12345") were
deliberately transparent (or at least translucent) to make it easier
to follow the example, but I would have no real problem with them
having a form more like "urn:lsid:bio-id.org:9876:12345".
2. I think a single-minded drive towards semantic opacity would be
as quixotic and self-destructive as anything we could do.  UUIDs are
nicely opaque, and we could build a DOI-like system which maps
individual UUIDs to their current locations.  Such an approach would
be painful and an administrative nightmare.  I also suspect that
such opaque identifiers would be resisted by most users.  If we step
away from such a pure implementation, the alternatives all embed
some kind of semantic cues which make the system operate better.
The form of a DOI encodes relevant data on the source of the
object.  PURLs and LSIDs do the same.  The point with semantic
opacity in the LSID specification is that it is not possible for a
client to make inferences about the location of data based on the
subelements within the LSID.  It is up to the resolver
implementations to determine how to return the data.  Once this
point is accepted, I would in fact say that the presence of some
semantic clues within the identifier text is a good thing.  The
clues may for various reasons no longer conform to the reality of
how the metadata are managed, but a user may still rapidly glean
relevant indications whether an identifier is worth resolving (it
may indicate that it relates to a nomenclatural record, or that it,
at least originally, was minted by some respected source).  I see
such clues as having the same kind of value which has enabled
Linnaean nomenclature to persist so long.  My preference for LSIDs
would therefore be for them to be like the ones Roger minted for BCI.
3. I also note that this discussion has suggested remarkable near-
unanimity from many people in their distaste for LSIDs.  However I
fear that the level of agreement would be little higher if we were
discussing DOIs, or PURLs.  Some of the objections have been that
LSIDs do not fit well with the key technologies of the semantic web
and that something more like PURLs would be the right course to
follow.  Other objections have related to the semantic near-
transparency of many LSIDs or the absence of strongly centralised
support with the implication that something more like DOIs would be
better.  Both arguments have value, but they point in different
directions.  The various identifier schemes make up a landscape
within which no identifier scheme represents an adaptive peak in all
contexts.  We need to develop applicability statements for how to
use several of these schemes as alternatives for biodiversity data
and we need to identify the drivers which may guide different
providers to different schemes for different purposes.
Donald
Donald Hobern, Director, Atlas of Living Australia
CSIRO Entomology, GPO Box 1700, Canberra, ACT 2601
Phone: (02) 62464352 Mobile: 0437990208
Email: Donald.Hobern@csiro.au
Web: http://www.ala.org.au/
-----Original Message-----
From: Bob Morris [mailto:morris.bob@gmail.com]
Sent: Wednesday, 8 April 2009 2:04 AM
To: Roderic Page
Cc: Hobern, Donald (Entomology, Black Mountain); Roger Hyam; tdwg-tag@lists.tdwg.org
Subject: Re: [tdwg-tag] SourceForge LSID project websites broken -
role for TDWG?
A few non random comments on Rod's random comments on Donald's
proposal
On Tue, Apr 7, 2009 at 11:19 AM, Roderic Page <r.page@bio.gla.ac.uk>
wrote:
...
A few random comments:
Donald wrote:
...
InstitutionCode/CollectionCode/CatalogueNumber triple and to the
three main substitutable elements in an LSID.  Some systems such as
DOI may obscure the whoGeneratedTheData
Rod responded:
...
This assumes that it's good to have lots of metadata embedded in the
identifier. This level of "branding" might be fine for specimens
(assuming each data provider has the ability to serve their own
data),
but what about shared identifiers such as taxon names -- I suspect
having to "choose a brand" is going to be an obstacle to adoption for
just the identifiers that we most need to share. Identifiers such as
DOIs have less branding (although publishers have managed to attach
branding significance to the few digits after the "10." prefix).
Bob cites:
"LSIDs are intended to be semantically opaque, in that the LSID
assigned to a resource should not be counted on to describe the
characteristics or attributes of the resource that the LSID refers to.
The users of the LSIDs are permitted to use individual components (as
specified elsewhere in this document) of LSIDs - although the LSID
component parts themselves should be treated as opaque pieces of the
identifier." LSID spec, Section 8.
It's regrettable that the LSID spec is so poorly written that it
permits the useless term "should". Alas, I suppose that leaves room
for argument with my position that LSIDs with embedded metadata are
not LSIDs--they are something else based on the LSID syntax.  There's
nothing inherently wrong with, oh, say, a Handles implementation based
on prefacing LSID syntax with something controlled. See below.
Rod remarks:
...
Note also that DOIs (and Handles) can be queried for metadata, see
Tony Hammond's OpenHandle project
(http://www.crossref.org/CrossTech/2008/10/the_last_mile.html and
http://code.google.com/p/openhandle/), so we don't need to embed
this in the actual identifier itself.
Bob replies:
DOIs \are/ Handles. This is the (unstated?) reason that
http://wiki.tdwg.org/twiki/bin/view/GUID/TechnologyComparison is
filled with comparisons of the form "DOI: Same as Handles".
DOI is an implementation of Handles, with the additional treatment of
things about which Handles is silent. See
http://www.doi.org/factsheets/DOIHandle.html  When I read that
document casually, I come to the initial conclusion that Donald's
proposal is essentially doing the same kind of extension to Handles
(possibly a Good Thing if correct), except for allowing metadata in
the identifier (yech!).
--Bob
--
Robert A. Morris
Professor of Computer Science
UMASS-Boston
ram@cs.umb.edu
http://bdei.cs.umb.edu/
http://www.cs.umb.edu/~ram
http://www.cs.umb.edu/~ram/calendar.html
phone (+1)617 287 6466
_______________________________________________
tdwg-tag mailing list
tdwg-tag@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-tag
-- 
Greg Whitbread
Australian National Botanic Gardens
Australian National Herbarium
+61 2 62509482
ghw@anbg.gov.au
