[tdwg-content] GUIDs for publications (usages and names)

Wed Jan 5 09:17:22 CET 2011

A few quick comments on this thread.

DOIs
----

CrossRef provides an OpenURL resolver for discovering whether a DOI  
exists for a publication. This is how publishers find DOIs for  
references cited in manuscripts, and several bibliographic tools use  
it to populate their databases (e.g., Zotero, Mendeley). I provide a  
wrapper to this resolver at http://bioguid.info/openurl.
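
As a rough illustration of the kind of lookup described above, here is a minimal sketch that only builds an OpenURL-style query string. The endpoint and the parameter names (pid, title, volume, spage, date, redirect) are assumptions recalled from CrossRef's public documentation and should be checked there before use; the journal details are placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Assumed endpoint for CrossRef's OpenURL resolver (verify against CrossRef's docs).
CROSSREF_OPENURL = "http://www.crossref.org/openurl"


def build_openurl_query(pid, title, volume, spage, date=None):
    """Build a query URL asking whether a DOI exists for an article.

    The parameter names are assumptions, not a definitive statement of the API.
    """
    params = {
        "pid": pid,           # e-mail address identifying the caller
        "title": title,       # journal title
        "volume": volume,     # journal volume
        "spage": spage,       # first page of the article
        "redirect": "false",  # ask for metadata instead of a redirect
    }
    if date is not None:
        params["date"] = date  # publication year
    return CROSSREF_OPENURL + "?" + urlencode(params)


# Placeholder values, not a real publication.
url = build_openurl_query("user@example.org", "Example Journal", "12", "34", "2010")
print(url)
```

Building the URL separately from fetching it makes it easy to log or cache queries before deciding how to call the resolver.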

The number of taxonomic works with DOIs is growing, and may be rather
larger than most people realise, especially as publishers such as
Wiley digitise archives of society journals, and as JSTOR increases
its content. And yes, CSIRO uses DOIs.

The "brief citations" Tony mentions, which are widely used by  
nomenclators, do pose problems for services such as CrossRef. I  
discuss one approach to solving this in my paper "bioGUID: resolving,
discovering, and minting identifiers for biodiversity informatics"
(http://dx.doi.org/10.1186/1471-2105-10-S14-S5), where I describe a
service that takes a page within a reference
and tries to locate the DOI for the enclosing paper. I'm using this  
service at the moment to find papers in the recently released Plant  
List.
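
The idea of going from a cited page to the enclosing paper can be sketched as a page-range lookup. This is only an illustration under stated assumptions, not the method used in the bioGUID paper: it assumes a local index of articles with known first and last pages, and the Article class, the INDEX records, and the 10.9999 DOI prefix are all made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Article:
    """Hypothetical index record: one article and its page range."""
    journal: str
    volume: str
    first_page: int
    last_page: int
    doi: str


def find_enclosing_doi(index: Iterable[Article], journal: str,
                       volume: str, page: int) -> Optional[str]:
    """Return the DOI of the article whose page range contains `page`."""
    for article in index:
        same_volume = (article.journal.lower() == journal.lower()
                       and article.volume == volume)
        if same_volume and article.first_page <= page <= article.last_page:
            return article.doi
    return None  # no article in the index covers this page


# Made-up records; 10.9999 is a placeholder prefix, not a real registrant.
INDEX = [
    Article("Example Journal", "12", 1, 20, "10.9999/example.12.1"),
    Article("Example Journal", "12", 21, 48, "10.9999/example.12.21"),
]

# A "brief citation" such as "Example Journal 12: 34" points inside the second article.
print(find_enclosing_doi(INDEX, "example journal", "12", 34))
```

A linear scan is enough for a sketch; a real index would presumably need sorted page ranges or a database query per journal and volume.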

OCLC
----

Paul mentions http://www.worldcat.org, which is a great resource,  
albeit riddled with duplicates (although the library world has its own  
notion of what counts as a duplicate). BHL content is now appearing in  
OCLC.

CiteBank
--------

As Steve points out, we're going to need a large community effort to  
assemble a bibliography of all taxonomic work. Without wishing to
sidetrack this thread, I think this is a task best undertaken with a much
larger community than we traditionally think of. A bibliographic
site such as Mendeley or Zotero seems a better bet
than doing this ourselves.

Linked Data
-----------

Without wishing to get too bogged down in a discussion of linked data,  
perhaps a few points are worth noting.

1. Although DOIs currently don't play nice with linked data (unless  
wrapped in a service such as http://bioguid), according to Geoff  
Bilder, CrossRef has plans to add content negotiation and RDF to
http://dx.doi.org, and is close to flicking the switch to make this happen.

2. The bulk of linked data resources for biology are not provided by  
the primary data sources (e.g., GenBank or PubMed) but by third party  
wrappers around those services. These services have made choices about  
URIs, vocabularies that might not hold if and when the primary data  
providers, such as NCBI, start serving RDF. In my experience these  
third party services may also omit things that we may think of as  
vital, for example http://bio2rdf.org doesn't include specimen or  
locality information for GenBank sequences.

3. For many resources we either have no linked data URIs or too many.  
This leads either to somebody having to mint URIs, typically for
something that isn't their own data (e.g., I provide HTTP URIs for  
ISSNs), or having to accommodate multiple URIs, which makes specifying  
relationships between objects rather tricky (one database may describe  
a citation link between two documents using PubMed identifiers, while  
another describes the same link using DOIs).
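
The multiple-URI problem in point 3 can be made concrete with a small sketch. It assumes somebody has already built an equivalence table between identifiers (SAME_AS below), which is the hard part in practice. The table, the identifiers, and the two "databases" are all invented for illustration.

```python
# Hypothetical equivalence table mapping PubMed-style identifiers to DOIs.
# Building such a table is the real reconciliation problem; here it is assumed.
SAME_AS = {
    "pmid:0000001": "doi:10.9999/a",
    "pmid:0000002": "doi:10.9999/b",
}


def canonical(identifier):
    """Map an identifier to its preferred form, or leave it unchanged."""
    return SAME_AS.get(identifier, identifier)


def merge_citation_links(*sources):
    """Combine (citing, cited) pairs from several sources, removing
    duplicates that differ only in which identifier scheme was used."""
    merged = set()
    for links in sources:
        for citing, cited in links:
            merged.add((canonical(citing), canonical(cited)))
    return merged


# Two made-up databases describing the same citation link in different schemes.
db_one = [("pmid:0000001", "pmid:0000002")]
db_two = [("doi:10.9999/a", "doi:10.9999/b")]

print(merge_citation_links(db_one, db_two))
```

Without the canonical step the two sources would appear to describe two different citation links, which is the mess described above.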

My point is that linked data isn't a panacea; in some respects it just
makes the mess we're in easier to see. The hope of the linked data  
community is that eventually it will all come together. For an  
alternative take on this I recommend Stefano Mazzocchi's essay "On  
Data Reconciliation Strategies and Their Impact on the Web of Data" http://www.betaversion.org/~stefano/linotype/news/304/

Regards

Rod

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962 at aim.com
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html