Dear Chris,
Damn, I was hoping nobody would ask about specifics as that would expose the fact that I've not thought much (any) of this through.
My instinct is that for those bits that will interact with existing publishers (e.g., will be cited in reference lists in papers), DOIs are the logical choice. Hence, article- and monograph-level items would get DOIs. This would satisfy the desire to make taxonomic literature visible to readers. It would also not require CrossRef to change any of the metadata that they serve.
For within-monograph elements (e.g., pages), I guess the issue boils down to (a) cost, and (b) what metadata would be served. Here, I'd be happy with Handles (there's nothing stopping us having Handles for the articles and monographs as well as DOIs).
So, we could have
Article: DOI:xxxxxx HDL:yyyyyyyy
Monograph: DOI:xxxxxxx HDL:yyyyyyyyy
p. 1 HDL:yyyyyyyy.1
p. 2 HDL:yyyyyyyy.2
.
.
p. n HDL:yyyyyyyy.n
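The page-level layout above could be generated mechanically. Here is a minimal Python sketch, assuming page Handles are formed by appending ".<page number>" to the monograph Handle as in the listing; the function name `page_handles` and the placeholder Handle value are hypothetical and not part of any agreed scheme.

```python
def page_handles(monograph_handle: str, page_count: int) -> list[str]:
    """Derive one Handle per page by appending ".<page number>" to the monograph Handle."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    # Pages are numbered from 1, matching "p. 1 ... p. n" in the listing above.
    return [f"{monograph_handle}.{page}" for page in range(1, page_count + 1)]


# Placeholder identifier (assumption), mirroring the "yyyyyyyy" used above.
for handle in page_handles("HDL:yyyyyyyy", 3):
    print(handle)
```

The point of the suffix convention is that only the monograph-level Handle needs to be recorded; the page-level ones follow from it.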
In some ways it would be ideal to have a single GUID, but even if DOIs are too costly to use at every level, I still think we need them for integration. I also wonder whether we need DOIs for page elements. When are they going to be used? I suspect only in the context of (a) old literature being marked up, or (b) new online taxonomic publications (say, if BMC published a journal). In both cases, the underlying XML could be structured in such a way as to recognise Handles.
Hope this makes sense. My instinct is that authors of a paper in Science, say, just want to cite a monograph or article, but somebody doing a taxonomic revision may want a much finer degree of citation, and in an ideal world they would be doing this electronically online.
Regards
Rod
On 20 Mar 2007, at 17:37, Chris Freeland wrote:
Rod,
Fascinating stuff, as always. DOIs have come up time and again in discussing the Biodiversity Heritage Library, and I wondered if you had any thoughts about DOIs for historic literature. You were fairly adamant at the EoL meeting in Woods Hole in February that DOIs were critical for BHL, on which we can all agree, but the unresolved question is at what level(s) we need DOIs. For serials it makes sense to assign them at the article level, but what about something as detailed as a page-level DOI or as broad as a title-level DOI? What then of monographs - would you want a DOI at the page level for protologue linking? Looking forward to your thoughts,
Chris
Chris Freeland Application Development Manager Missouri Botanical Garden (314) 577-9548
-----Original Message-----
From: tdwg-guid-bounces@lists.tdwg.org [mailto:tdwg-guid-bounces@lists.tdwg.org] On Behalf Of Roderic Page
Sent: Tuesday, March 20, 2007 9:15 AM
To: tdwg-tag@lists.tdwg.org; tdwg-guid@lists.tdwg.org
Cc: Simon Rycroft; Vince Smith; David Remsen; Donat Agosti; William Piel
Subject: [tdwg-guid] BioGUID
Dear All,
I've put together a web site called http://bioguid.info which, rather grandly, is an attempt to bootstrap the biodiversity Semantic Web by providing resolvable URIs for biological objects, such as publications, taxonomic names, nucleotide sequences, and specimens.
These URIs (or "GUIDs") can be resolved by a web browser to display HTML, but under the hood are resolved to RDF (which you can see by viewing the source of the web page you get for a URI).
The web interface is really window dressing; I just wanted a way to display RDF that wouldn't frighten people (me included). For some URIs all I do is grab XML and reformat it (e.g., DOIs). For GenBank records, all manner of agony is involved in trying to extract specimen and publication links.
A good place to get a sense of what bioguid.info is about is to start with this Pubmed record: http://bioguid.info/pmid:17079492
From this, you can get a list of sequences. If you click on one of those, you'll see a link to a specimen, which you can then look at (you'll need Firefox 1.5, Camino, WebKit, or a browser with an SVG plugin for the full effect).
What I'm hoping to do is start doing things like taking a TreeBASE record, getting the linked sequences, running these through bioguid.info to extract georeferenced specimen links, so with minimal effort we get a map (and eventually a Google Earth tree a la Bill Piel).
The glue to make this happen comprises HTTP URIs that are dereferenceable to RDF. This is my mantra for the day.
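As a rough illustration of what "dereferenceable to RDF" means for a client, here is a minimal Python sketch that builds an HTTP request asking for RDF/XML. It assumes the server honours an `Accept: application/rdf+xml` header, which is an assumption about the service rather than something stated above; the function names `rdf_request` and `fetch_rdf` are hypothetical.

```python
import urllib.request

RDF_XML = "application/rdf+xml"


def rdf_request(uri: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that asks for RDF/XML."""
    return urllib.request.Request(uri, headers={"Accept": RDF_XML})


def fetch_rdf(uri: str, timeout: float = 10.0) -> bytes:
    """Dereference the URI and return the raw response body.

    Needs network access; whether the body is really RDF depends on the server.
    """
    with urllib.request.urlopen(rdf_request(uri), timeout=timeout) as response:
        return response.read()


# Only the request is built here; nothing is sent over the network.
request = rdf_request("http://bioguid.info/pmid:17079492")
print(request.full_url, request.get_header("Accept"))
```

Calling `fetch_rdf` on the same URI would perform the actual dereference, provided the service is reachable.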
Regards
Rod
Professor Roderic D. M. Page Editor, Systematic Biology DEEB, IBLS Graham Kerr Building University of Glasgow Glasgow G12 8QP United Kingdom
Phone: +44 141 330 4778 Fax: +44 141 330 2792 email: r.page@bio.gla.ac.uk web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html iChat: aim://rodpage1962 reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
Subscribe to Systematic Biology through the Society of Systematic Biologists Website: http://systematicbiology.org Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/ Find out what we know about a species: http://ispecies.org Rod's rants on phyloinformatics: http://iphylo.blogspot.com Rod's rants on ants: http://semant.blogspot.com
tdwg-guid mailing list tdwg-guid@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-guid
Rod,
I'd just like to support your suggestion that DOIs are the obvious choice wherever we need a GUID for interacting with publishers or ensuring citability, and certainly for any elements which may require special handling (e.g. alternate views for subscribers and the general public).
In case there is any confusion over the outcomes of the TDWG workshops on GUIDs, the conclusion there was that different identifier models were likely to be appropriate in different situations. The general recommendation to adopt LSIDs was because they were lightweight at the time of issuing (no need to register them centrally) and could be resolved without a dependency on a single central service. At the same time, the choice of a URN-based identifier scheme rather than HTTP URIs still seems (at least to me) to be a benefit because we want to be able to assign identifiers which (at least in principle) are not tied to the current (albeit seemingly omnipresent) HTTP technologies - many of the objects we wish to identify have already had a valuable existence far longer than the Internet Age. In cases needing a more centralised and potentially robust solution, and where linkage to the publishing world is desired, DOIs are often likely to be the preferred choice.
Donald
On Mar 21, 2007, at 4:21 PM, Roderic Page wrote:
------------------------------------------------------------ Donald Hobern (dhobern@gbif.org) Deputy Director for Informatics Global Biodiversity Information Facility Secretariat Universitetsparken 15, DK-2100 Copenhagen, Denmark Tel: +45-35321483 Mobile: +45-28751483 Fax: +45-35321480 ------------------------------------------------------------
Dear Donald,
There is a tension here. URN-based identifiers (which include both LSIDs and DOIs) have the advantage of persisting independently of HTTP. One could argue that HTTP URIs could be treated in the same way (just strip off "http://"), but that might be thought of as cheating ;-)
Since I'm arguing for DOIs and Handles, I'm in favour of URNs. But we then have to be able to de-reference them, which today means that they need to be treatable as HTTP URIs, especially if we need to play ball with the Semantic Web.
This is why I set up bioguid.info to play with things like DOIs. It acts as a proxy server that takes a URN and returns RDF (and tries to play by Semantic Web rules such as returning a 303 and redirecting to the RDF representation of an object).
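The 303 behaviour can be sketched as a pure function that maps an incoming identifier to a redirect. This is a minimal Python illustration and not how bioguid.info is implemented: the base URL `http://example.org`, the `/rdf/` path, the list of accepted prefixes, and the function name `proxy_response` are all assumptions.

```python
from urllib.parse import quote

# Identifier schemes this hypothetical proxy accepts (assumption).
ACCEPTED_PREFIXES = ("doi:", "hdl:", "urn:lsid:")


def proxy_response(identifier: str, base: str = "http://example.org") -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a request naming a non-HTTP identifier.

    A recognised identifier gets a 303 See Other pointing at the RDF document
    that describes the object; anything else gets a 400.
    """
    if not identifier.lower().startswith(ACCEPTED_PREFIXES):
        return 400, {}
    # Keep ":" and "/" readable so the identifier stays visible in the URL.
    location = f"{base}/rdf/{quote(identifier, safe=':/')}"
    return 303, {"Location": location}


print(proxy_response("doi:10.1234/example"))
```

The 303 keeps the URI of the object itself distinct from the URI of the RDF document about it.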
My problem with LSIDs (and it kinda breaks my heart to say this given how much I've played with them) is that I don't see them offering any real advantages that outweigh the hassle of supporting their own resolution protocol. If what we ultimately need is HTTP URIs, then why go through the pain of LSID server software?
Regards
Rod
Dear Rod,
I really do understand and sympathise with your position. I see the strengths of DOI for many purposes, and believe we would be crazy not to use them whenever we need an identifier for something that might need to be cited by publishers, etc. Handles may offer a good alternative, but I do wonder whether the long-term costs of maintaining a reliable and persistent Handle system will really be much less than what we might achieve if we negotiated a model appropriate to our needs with DOI.org.
To my mind, in many circumstances LSIDs do represent an appropriate mid-point between the centralised model of DOIs and the completely open world of HTTP URIs, although I do recognise that several of my reasons have more to do with sociology than technology. Here are the key points:
1. LSIDs do not require any central registration before they can be used - this is probably more appropriate than the DOI model for those cases in which we may wish to mint hundreds of thousands of identifiers and global uniqueness matters more than permanent resolvability.
2. LSIDs do not (in the current model) rely on the availability of any central service other than the basic infrastructure of the Internet.
3. LSIDs (unlike HTTP URIs) do not presuppose that the Internet we have known for the last 15-20 years will persist forever - and I regard this as important if we want to start building information tools to support a science with a long history like taxonomy/natural history into the future.
4. LSIDs (unlike HTTP URIs) require some thought and planning before they are allocated - a pure HTTP URI scheme will inevitably lead many to use their current web server structure as the basis for their identifiers.
It is however clear that we need to be able to support semantic web technologies today and that we do need to define some best practices for how and when to use an HTTP proxy server for LSIDs (or Handles or DOIs). I see this as sensible, provided we can find a way to preserve the identity of the naked identifier as being somehow the "real" GUID.
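One way to keep the naked identifier as the "real" GUID is to treat the proxy URL as a reversible wrapper around it. Here is a minimal Python sketch, assuming a single fixed proxy prefix; the prefix `http://example.org/`, the sample LSID, and both function names are hypothetical.

```python
PROXY = "http://example.org/"  # hypothetical proxy base, not a real service


def to_proxy_uri(identifier: str, proxy: str = PROXY) -> str:
    """Wrap a naked identifier (LSID, Handle, or DOI) in an HTTP URI."""
    return proxy + identifier


def naked_identifier(uri: str, proxy: str = PROXY) -> str:
    """Recover the naked identifier; a URI without the proxy prefix is returned unchanged."""
    return uri[len(proxy):] if uri.startswith(proxy) else uri


lsid = "urn:lsid:example.org:names:12345"  # made-up LSID for illustration
wrapped = to_proxy_uri(lsid)
print(wrapped)
print(naked_identifier(wrapped))
```

Because the mapping round-trips, data can store the naked identifier and add the proxy prefix only at the point of resolution.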
Thanks as ever for keeping the real issues to the fore.
Donald