[tdwg-content] GUIDs for publications (usages and names)
r.page at bio.gla.ac.uk
Wed Jan 5 09:17:22 CET 2011
A few quick comments on this thread.
CrossRef provides an OpenURL resolver for discovering whether a DOI
exists for a publication. This is how publishers find DOIs for
references cited in manuscripts, and several bibliographic tools use
it to populate their databases (e.g., Zotero, Mendeley). I provide a
wrapper to this resolver at http://bioguid.info/openurl.
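For anyone who hasn't used an OpenURL resolver, here is a minimal sketch of what building such a lookup might look like. It only constructs the query URL and sends nothing. The endpoint and the parameter names (pid, title, volume, spage, date, redirect) reflect my understanding of CrossRef's query interface and should be checked against their documentation; the function name and all the values in the example are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Assumption: CrossRef's OpenURL endpoint; verify against CrossRef's own docs.
CROSSREF_OPENURL = "http://www.crossref.org/openurl"


def build_openurl_query(journal_title, volume, start_page, year, pid,
                        base=CROSSREF_OPENURL):
    """Return an OpenURL query string for a journal/volume/page/year lookup.

    `pid` is the account identifier (typically an email address) that the
    resolver is assumed to require. Nothing is sent over the network.
    """
    params = {
        "pid": pid,
        "title": journal_title,   # journal title, not article title
        "volume": volume,
        "spage": start_page,      # first page of the article
        "date": year,
        "redirect": "false",      # ask for metadata rather than a redirect
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder citation, not a real reference.
    print(build_openurl_query("Example Journal", 12, 345, 1999,
                              pid="you@example.org"))
```

The same function could be pointed at a wrapper by passing a different `base`, assuming the wrapper accepts the same parameters.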
The number of taxonomic works with DOIs is growing, and may be rather
larger than most people realise, especially as publishers such as
Wiley digitise archives of society journals, and as JSTOR increases
its content. And yes, CSIRO uses DOIs.
The "brief citations" Tony mentions, which are widely used by
nomenclators, do pose problems for services such as CrossRef. I
discuss one approach to solving this in my paper "bioGUID: resolving,
discovering, and minting identifiers for biodiversity informatics"
(http://dx.doi.org/10.1186/1471-2105-10-S14-S5), where I describe a
service that takes a page within a reference
and tries to locate the DOI for the enclosing paper. I'm using this
service at the moment to find papers in the recently released Plant
Paul mentions http://www.worldcat.org, which is a great resource,
albeit riddled with duplicates (although the library world has its own
notion of what counts as a duplicate). BHL content is now appearing in
As Steve points out, we're going to need a large community effort to
assemble a bibliography of all taxonomic work. Without wishing to
sidetrack this thread, I think this is a task best undertaken with a
much larger community than we traditionally think of. A bibliographic
site such as Mendeley, or something similar such as Zotero, seems a
better bet than doing this ourselves.
Without wishing to get too bogged down in a discussion of linked data,
perhaps a few points are worth noting.
1. Although DOIs currently don't play nice with linked data (unless
wrapped in a service such as http://bioguid), according to Geoff
Bilder, CrossRef has plans to add content negotiation and RDF to
http://dx.doi.org, and is close to flicking the switch to make this happen.
2. The bulk of linked data resources for biology are not provided by
the primary data sources (e.g., GenBank or PubMed) but by third party
wrappers around those services. These services have made choices about
URIs and vocabularies that might not hold if and when the primary data
providers, such as NCBI, start serving RDF. In my experience these
third party services may also omit things that we may think of as
vital, for example http://bio2rdf.org doesn't include specimen or
locality information for GenBank sequences.
3. For many resources we either have no linked data URIs or too many.
This either leads to somebody having to mint URIs, typically for
something that isn't their own data (e.g., I provide HTTP URIs for
ISSNs), or having to accommodate multiple URIs, which makes specifying
relationships between objects rather tricky (one database may describe
a citation link between two documents using PubMed identifiers, while
another describes the same link using DOIs).
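To make point 1 concrete, here is a rough sketch of what a linked data client would do once content negotiation is available for DOIs: ask the resolver for RDF via the HTTP Accept header instead of following the redirect to the publisher's HTML page. This assumes the resolver honours an Accept header of application/rdf+xml, which is the planned behaviour described above, not something I can promise. The function name is hypothetical, and the request is only built, not sent.

```python
from urllib.request import Request


def build_rdf_request(doi, resolver="http://dx.doi.org/",
                      media_type="application/rdf+xml"):
    """Build (but do not send) a GET request asking a DOI resolver for RDF.

    Assumption: the resolver supports content negotiation and returns RDF
    when the Accept header asks for it.
    """
    return Request(resolver + doi, headers={"Accept": media_type})


if __name__ == "__main__":
    # DOI of the bioGUID paper cited earlier in this message.
    req = build_rdf_request("10.1186/1471-2105-10-S14-S5")
    print(req.get_method(), req.full_url)
    print("Accept:", req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would actually perform the lookup; whether RDF comes back depends entirely on the resolver.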
My point is that linked data isn't a panacea; in some respects it just
makes the mess we're in easier to see. The hope of the linked data
community is that eventually it will all come together. For an
alternative take on this I recommend Stefano Mazzocchi's essay "On
Data Reconciliation Strategies and Their Impact on the Web of Data"
(http://www.betaversion.org/~stefano/linotype/news/304/).
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962 at aim.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html