[tdwg-tag] Specimen identifiers
r.page at bio.gla.ac.uk
Fri Feb 24 11:21:45 CET 2012
On 24 Feb 2012, at 08:53, Gregor Hagedorn wrote:
> On 23 February 2012 20:41, Kevin Richards
> <RichardsK at landcareresearch.co.nz> wrote:
>> Another problem is what the identifier refers to. As someone (I think Rich) said in a recent post, two different people may apply the same identifier to slightly different things - eg to the "name" of a person, or to the "person" itself. This is another barrier to reuse of shared identifiers.
> My understanding of the semantic web is actually that giving DIFFERENT
> identifiers (and as Rich says: "globally unique", "persistent", and
> "actionable") to things is a GOOD thing.
No, it's not ;)
> Depending on your purpose, two identifiers may or may not be the same.
> This is a standard problem, and I believe it is much easier solved if
> I have 2 identifiers and separate sameAs assertions. I can use the
> existing sameAs as my default, but can easily differ in opinion (by
> using contradictory sameAs - which may involve localizing and
> modifying the default sameAs, but at least it is solvable.)
> With regard to publications:
> Should a
> a) OpenAccess Preprint
> b) OpenAccess Postprint
> c) ClosedAccess Elsevier Journal
> (well, rare to find these together :-) )
> have the same ID or not?
> For many purposes they are sameAs, but not for all.
One identifier for the publication (the "thing"), multiple identifiers for the different representations if you want to be that granular, but these all link to the overall id. Otherwise we have a mess.
One of the most sensible things the science publishing industry did was assign DOIs to articles, not their individual representations (DOIs can support multiple resolutions, as can HTTP URIs through content negotiation). This means I can cite an article by linking to the DOI, and ignore what is for most purposes irrelevant (the representation). The citation network would be a hellish mess without this simplification.
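To make the content negotiation point concrete, here is a minimal sketch in Python. It only builds the requests and does not send them. The DOI is a hypothetical placeholder, and the helper name doi_request is my own. The CSL JSON media type is one that the doi.org resolver accepts for DOIs from the major registration agencies, but treat the details as an illustration, not a definitive client.

```python
from urllib.request import Request

DOI_RESOLVER = "https://doi.org/"


def doi_request(doi: str, media_type: str) -> Request:
    """Build (but do not send) a request for one representation of a DOI.

    The identifier in the URL stays the same; only the Accept header
    says which representation the client would like back.
    """
    return Request(DOI_RESOLVER + doi, headers={"Accept": media_type})


# Hypothetical DOI, for illustration only.
doi = "10.1234/example"

# Same identifier, two different representations requested.
landing_page = doi_request(doi, "text/html")
metadata = doi_request(doi, "application/vnd.citationstyles.csl+json")
```

Both requests point at the same URL, so anything that cites the article only ever needs the one identifier.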
Contrast this with ISSNs, which differ depending on the representation (print or electronic). The result is a mess: it's not clear what the unique identifier for a journal should be, and people then have to create tools to assert that two ISSNs are the "same".
We seem determined to make this harder than it needs to be, especially for end users.
> However, Rod is correct about the poor history (although I don't
> consider LSIDs actionable, they are about as actionable as the text
> strings - you can machine-resolve both, but...).
No, they are actionable; you just need the right tools. Despite the fact that I think they suck, I consume them in numerous projects.
> I just think Rod should call for re-use of identifiers where it is
> believed to be identical for all purposes and new identifiers PLUS
> sameAs relations where uncertain.
I'd make it simpler, just re-use identifiers wherever possible. If your metadata includes a bibliographic citation, include whatever bibliographic identifiers you can find (DOI, PubMed, ISSN, ISBN, etc.). Same for taxonomic names, if you include a name include associated identifiers. That way this cloud of data we are generating has a fighting chance of coalescing.
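As a rough sketch of what "coalescing" could look like in practice: if every record carries whatever existing identifiers it can, records from different sources that share any identifier can be grouped mechanically, with no separate sameAs layer. The record names and identifier values below are made up for illustration, and the function name coalesce is my own. The grouping uses a plain union-find.

```python
from collections import defaultdict


def coalesce(records):
    """Group record names that share at least one identifier.

    `records` maps a record name to the set of identifiers it carries.
    Returns a sorted list of sorted groups of record names.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Link every record to each identifier it re-uses.
    for name, identifiers in records.items():
        for identifier in identifiers:
            union(("record", name), ("id", identifier))

    # Records that end up under the same root belong together.
    clusters = defaultdict(set)
    for name in records:
        clusters[find(("record", name))].add(name)
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical records and identifiers, for illustration only.
records = {
    "museum_db": {"doi:10.1234/example", "pmid:0000001"},
    "literature_index": {"doi:10.1234/example"},
    "name_list": {"isbn:000-0-00-000000-0"},
}
clusters = coalesce(records)
```

Here museum_db and literature_index fall into one group because they both cite the same DOI, while name_list stays on its own. Had each source minted a fresh identifier for the article, nothing would join up without someone also maintaining the mappings.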
I guess I'm letting my frustration show, but every time we introduce another layer of complexity, another epicycle in our models, a kitten dies.
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962 at aim.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html