Re: [tdwg-tag] Blog: UUIDs may be Dangerous
I have much sympathy for the concerns about LSIDs but continue to feel that it would prove a mistake for our community to adopt the kinds of URLs that are being suggested in this thread.
Kevin is quite correct that much of the reason for promoting LSIDs is social: we want data providers to think hard before issuing LSIDs.
I worry however that, even if we deprecate URLs like http://www.my.org/entomology/tomcat/display.jsp?catno=12345 in favour of something more like http://my.org/stable-id/entomologycollection/12345, we are left with the following problems:
1. my.org still needs to step up to the long-term responsibility to handle URL rewriting for these ids.
2. my.org still needs to return data in a standard agreed form (e.g. RDF or OWL) when requested from these URLs.
3. Most important of all, our domain has a long history and information on specimens has value perhaps for centuries - why would we adopt a syntax for persistent long-term identifiers which is so strongly tied to the web and the web technologies of the very early 21st century?
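The URL rewriting in point 1 could be sketched roughly as follows. This is only a minimal illustration, assuming a hypothetical routing table (`BACKENDS`) and path pattern (`STABLE_ID`) that my.org would have to maintain; neither is part of any existing system, and the URLs are simply the two examples quoted above.

```python
import re
from typing import Optional

# Hypothetical pattern for the stable form quoted above:
#   /stable-id/<collection>/<catalogue number>
STABLE_ID = re.compile(r"^/stable-id/(?P<collection>[a-z]+)/(?P<catno>\d+)$")

# Hypothetical routing table: collection name -> current back-end URL template.
# When the back end changes (new server, new software), only this table changes.
BACKENDS = {
    "entomologycollection":
        "http://www.my.org/entomology/tomcat/display.jsp?catno={catno}",
}


def rewrite(path: str) -> Optional[str]:
    """Return the current back-end URL for a stable-id path, or None if unknown."""
    match = STABLE_ID.match(path)
    if match is None:
        return None
    template = BACKENDS.get(match.group("collection"))
    if template is None:
        return None
    return template.format(catno=match.group("catno"))


if __name__ == "__main__":
    print(rewrite("/stable-id/entomologycollection/12345"))
```

The long-term responsibility lies in keeping such a table correct for as long as the identifiers are in circulation; the code itself is the easy part.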
Donald
Donald Hobern, Director, Atlas of Living Australia CSIRO Entomology, GPO Box 1700, Canberra, ACT 2601 Phone: (02) 62464352 Mobile: 0437990208 Email: Donald.Hobern@csiro.au
Well, this is all rather depressing -- we still haven't got the identifier problem sorted.
I worry however that, even if we deprecate URLs like http://www.my.org/entomology/tomcat/display.jsp?catno=12345 in favour of something more like http://my.org/stable-id/entomologycollection/12345, we are left with the following problems:
- my.org still needs to step up to the long-term responsibility to handle URL rewriting for these ids.
I doubt the majority of data sources are up to this, or anything like it. But this is easier than implementing LSIDs (by a long shot).
- my.org still needs to return data in a standard agreed form (e.g. RDF or OWL) when requested from these URLs.
Again, they barely manage to serve existing standards (such as Darwin Core) without messing up parts of it. Current RDF standards have yet to demonstrate any practical utility (I'm not saying that they aren't useful, just that they haven't shown this yet).
- Most important of all, our domain has a long history and information on specimens has value perhaps for centuries - why would we adopt a syntax for persistent long-term identifiers which is so strongly tied to the web and the web technologies of the very early 21st century?
Yes, but LSIDs use the DNS both as a means to help ensure uniqueness, and for resolution.
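To make the DNS dependence concrete, here is a small sketch that splits an LSID into its parts and derives the DNS name a resolver would look up. It assumes the usual `urn:lsid:authority:namespace:object[:revision]` layout and the `_lsid._tcp` SRV convention from the LSID specification; the example identifier is made up, and no network lookup is performed.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str           # normally a DNS domain name, e.g. "my.org"
    namespace: str
    object_id: str
    revision: Optional[str]  # optional trailing component


def parse_lsid(text: str) -> Lsid:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(not part for part in parts[2:5]):
        raise ValueError(f"empty LSID component in {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


def srv_record_name(lsid: Lsid) -> str:
    """DNS SRV name a client would query to find the resolution service."""
    return f"_lsid._tcp.{lsid.authority}"


if __name__ == "__main__":
    # Hypothetical identifier, for illustration only.
    lsid = parse_lsid("urn:lsid:my.org:entomologycollection:12345")
    print(lsid)
    print(srv_record_name(lsid))
```

The domain name sits in the identifier itself, so if my.org loses its domain the LSID is no better off than the equivalent URL.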
If we are genuinely concerned about longevity then why not look closely at systems explicitly designed to be independent of the web, such as Handles (and therefore, DOIs), and/or ARK? I find it interesting that this is the route taken by organisations that have a very real stake in the future (i.e., publishers). I don't get the sense that our field takes this particularly seriously (if we did we wouldn't be in the current mess). Note also that the most successful identifiers in our field (GenBank accession numbers, and PubMed identifiers) are widely shared, reused, and have no web technologies embedded in them.
My sense is that if we continue to think in terms of widely distributed, independent data providers, with little centralisation, then URLs are the only feasible identifier (it's the only easy one to set up). We then have to live with fragility.
Or, somebody steps up and provides centralised resolution of unique identifiers, we get a workable system (a la CrossRef), and we start to make some progress.
It beggars belief that we are still stuck on this issue.
Regards
Rod
_______________________________________________ tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
--------------------------------------------------------- Roderic Page Professor of Taxonomy DEEB, FBLS Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1962@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
Donald writes:
I worry however that, even if we deprecate URLs like http://www.my.org/entomology/tomcat/display.jsp?catno=12345 in favour of something more like http://my.org/stable-id/entomologycollection/12345, we are left with the following problems:
- my.org still needs to step up to the long-term responsibility to handle URL rewriting for these ids.
- my.org still needs to return data in a standard agreed form (e.g. RDF or OWL) when requested from these URLs.
On 1: I think it would be a great community service to provide stable-id.gbif.net/my.org/entomologycollection/12345 for those having problems meeting the technical or management demands.
On 2: Why should we limit ourselves to web technologies of the very early 21st century?
Seriously, I believe the system of content negotiation is much more flexible than the fixed LSID system (which though not fixed per se to RDF, requires a global agreement). With standard content negotiation web technology a provider can migrate to newer standards later on, but still honor requests for xml/rdf type. And without any effort, the same URL can provide human readable content in a format desired by the provider.
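As a rough illustration of what content negotiation means here, the sketch below picks a representation from an HTTP Accept header. It is a simplification under stated assumptions: the list of formats in `AVAILABLE` is hypothetical, and the matching honours q-values and wildcards but ignores the finer precedence rules of the HTTP specification.

```python
from typing import List, Optional, Tuple

# Hypothetical representations a provider might offer for one specimen URL.
# New formats can be appended later without changing the identifier.
AVAILABLE = ["text/html", "application/rdf+xml"]


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose(header: str, available: List[str] = AVAILABLE) -> Optional[str]:
    """Pick the first available media type the client accepts, or None."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        for candidate in available:
            if media_type in ("*/*", candidate):
                return candidate
            # Type wildcard such as "application/*".
            if media_type.endswith("/*") and candidate.startswith(media_type[:-1]):
                return candidate
    return None


if __name__ == "__main__":
    print(choose("application/rdf+xml"))              # a harvester
    print(choose("text/html,application/xhtml+xml"))  # a browser
```

A harvester asking for application/rdf+xml and a browser asking for text/html would both use the same URL, which is the flexibility argued for above.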
Gregor
participants (3): Donald.Hobern@csiro.au, Gregor Hagedorn, Roderic Page