I’ve added some starting comments on the
question of whether or not a “central LSID registration authority” would
be useful to our community:
http://wiki.gbif.org/guidwiki/wikka.php?wakka=GUIDCentralRegistrationAuthority
(text appended below)
This is one of the Infrastructure Working Group
topics.
As I have noted, there are topics which could be
addressed under this title but which I have deliberately excluded. Please
address these if you think they are valuable.
Please provide your input.
Thanks,
Donald
---------------------------------------------------------------
Donald Hobern (dhobern@gbif.org)
Programme Officer for Data Access and Database Interoperability
Global Biodiversity Information Facility Secretariat
Universitetsparken 15, DK-2100
Tel: +45-35321483
---------------------------------------------------------------
A central LSID registration authority
could take several different forms. The following is an attempt to outline some
options, so that subsequent discussion can focus more clearly on the benefits
of each.
I'm going to start by ignoring the
possibility of establishing a central LSID
registration authority as some kind of registrar approving or recording new
LSID authorities within our domain. I'm also going to ignore the possibility of
requiring all LSIDs to be registered within a single namespace. I cannot see
value in either of these "options". If anyone wishes to propose them,
please go ahead.
It is assumed here that the purpose of such an authority would be to provide a
mechanism for data providers to associate LSIDs with their records without
having to establish and maintain an LSID resolver in their own namespace. This
could for example be because the data set in question is likely to be moved to
a different home, or because the data provider cannot get permission to
register an LSID provider with DNS, or because the data provider's host
institution has restrictive rules on firewall access.
Under these circumstances a data provider may still wish to be able to assign
LSIDs to each data record and to have the LSID resolve correctly to the
appropriate data and metadata.
Some options:
The simplest option technically (or at least simplest with respect to the
assignment of LSIDs) may be for the central LSID registration authority
actually to host the data sets in question on behalf of the data provider.
As a second option, if the data provider is able to establish an LSID resolver
but is unable or unwilling to register this resolver with DNS, a central LSID
registration authority might establish a proxy LSID resolver, which redirects
to the known location of the actual LSID resolver. In other words, assuming
that the central LSID registration authority has an LSID resolver registered
for clra.org and running on lsid.clra.org, and the data provider has
established its own unregistered LSID resolver on lsid.dp.org, the data
provider can issue LSIDs of the form
urn:lsid:clra.org:<DatasetName>:<RecordId>. Requests for data or metadata will
be directed to lsid.clra.org, which looks up <DatasetName>, finds that it is
associated with the unregistered LSID resolver on lsid.dp.org, and forwards
the request to lsid.dp.org for resolution.
This again should be relatively simple to implement, and could be of use in
some circumstances. The data provider and the central LSID registration
authority clearly need to coordinate the elements included in the LSID string.
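To make the proxy idea concrete, here is a minimal sketch of the lookup step
that lsid.clra.org would need to perform. It is only an illustration under
stated assumptions: the dataset name "herbarium", the in-memory registry and
the function names are all hypothetical, and a real proxy would also have to
forward the request itself rather than just return a host name.

```python
from typing import NamedTuple


class Lsid(NamedTuple):
    authority: str
    namespace: str   # used here to carry <DatasetName>
    object_id: str   # used here to carry <RecordId>


CENTRAL_AUTHORITY = "clra.org"

# Hypothetical registry held by the central authority: each dataset name is
# mapped to the host of the data provider's own unregistered LSID resolver.
DATASET_RESOLVERS = {"herbarium": "lsid.dp.org"}


def parse_lsid(lsid: str) -> Lsid:
    """Split urn:lsid:<authority>:<namespace>:<object>[:<revision>]."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not a well-formed LSID: {lsid!r}")
    # The authority is a DNS name, so it is normalised to lower case.
    # An optional revision part is accepted but ignored in this sketch.
    return Lsid(parts[2].lower(), parts[3], parts[4])


def proxy_target(lsid: str) -> str:
    """Return the host that should actually resolve this LSID."""
    parsed = parse_lsid(lsid)
    if parsed.authority != CENTRAL_AUTHORITY:
        raise ValueError(f"LSID was not issued under {CENTRAL_AUTHORITY}")
    try:
        return DATASET_RESOLVERS[parsed.namespace]
    except KeyError:
        raise LookupError(f"no resolver registered for dataset {parsed.namespace!r}")


if __name__ == "__main__":
    print(proxy_target("urn:lsid:clra.org:herbarium:12345"))
```

The registry is the point at which the coordination mentioned above happens:
the data provider and the central authority have to agree on the dataset name
before any LSIDs are issued, since that name is baked into every identifier.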
A third option, slightly more complex (and potentially with too many
uncertainties), is for a data provider to register a provider tool using some
other protocol (e.g. TAPIR) and rely on an external proxy LSID resolver to map
LSID requests to the appropriate protocol. For example, assume that a TAPIR provider
is set up on tapir.dp.org to serve data in some version of
This is again not complex to implement, but requires even more coordination
than with option 2. It could however be applicable in any case in which a data
provider has any mechanism for sharing their data online, regardless of what
additional firewall restrictions are in place.
This is doubtless not a complete set of options, but should be enough to begin
discussion. So, do you see value in considering any of these options as part of
the infrastructure we plan to adopt? It may of course be that any and all of
these can be used transparently within any LSID-based network, but we should
consider whether there are immediate benefits in offering some kind of central
service (through TDWG, GBIF or others) for these purposes.