Rod's point that "they offer no added value" reflects pretty much the discussion we have on names. Nobody asks the question "name of what?". Names are linked to observations, i.e. specimens. Any identifier, or indeed the underlying treatment, that does not deliver this data is up in the air and thus meaningless. Not just for the publisher who wants to use the name, but also for the taxonomists, who cannot discuss the meaning of names. These discussions happen through publications, and at the moment publications don't allow this either: not for machines (or at best with considerable, prohibitive effort), nor for people.
A simple solution would be that, in future, each usage of a name in the taxonomic literature, at least in treatments, has to be explicitly linked to the underlying specimen or observation data. Since this is part of a treatment, the identifier should allow one to retrieve the treatment and, through it, the specimen data or at least the observation data, and the author who used the name in this specific selection.
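To make the idea concrete, here is a rough Python sketch of what "identifier -> treatment -> specimens and author" could look like. Everything in it is hypothetical: the class names, the identifier "treatment:0001", the placeholder name "Aus bus", and the specimen records are made up for illustration and do not describe any existing database or schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Specimen:
    # Hypothetical minimal specimen record.
    collection: str
    catalogue_number: str


@dataclass
class Treatment:
    # A name usage together with the evidence it rests on.
    treatment_id: str
    name: str
    author: str
    specimens: list = field(default_factory=list)


# In-memory stand-in for whatever store would hold the treatments.
TREATMENTS = {}


def register(treatment):
    TREATMENTS[treatment.treatment_id] = treatment


def name_of_what(treatment_id):
    """Answer "name of what?": return the name, who used it, and on which specimens."""
    t = TREATMENTS[treatment_id]
    return t.name, t.author, list(t.specimens)


# Illustrative data only.
register(
    Treatment(
        treatment_id="treatment:0001",
        name="Aus bus",
        author="Smith",
        specimens=[Specimen("XYZ", "12345"), Specimen("XYZ", "12346")],
    )
)
```

The point of the sketch is only that the identifier resolves to the treatment, and the treatment carries the specimen links, so a name is never handed out without its evidence.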
Donat
Why aren't identifiers reused?
Because in most cases they offer no added value. If I have an ITIS TSN there's not much I can do with it. I can get a name from ITIS (with some vague assurance that this name is accepted - or not - with no evidence for this assertion). I can think of only two taxonomic identifiers that have real value and get much reuse, NCBI taxonomy ids and uBio NameBankIDs. NCBI ids are reused because they underpin the genomics databases, and genomics does real computational biology, and makes extensive reuse of data (as exemplified by the annual Nucleic Acids Research database issue). uBio NameBankIDs get reused because uBio has lots of names, and provides services for discovering those names in text (see, e.g., their use by BHL).
Few taxonomic name databases provide a compelling reason for anyone to use their identifiers, most being digital sinks (you go there, get an identifier for a name, and nothing else).
Why UUIDs?
UUIDs are ugly, and solve a problem that for the most part we don't have. They are ideal for minting globally unique identifiers in a distributed system, but we don't have distributed systems. Catalogue of Life uses UUIDs, but these are centrally created (I suspect using MySQL's UUID function, given how similar the UUIDs are to each other). ZooBank uses them, but it is not (yet) a distributed system. If the Catalogue of Life were genuinely a distributed system UUIDs would make sense, but that's not actually how it works.
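As a small illustration of why centrally minted UUIDs can look so alike, here is a sketch using Python's standard uuid module. It assumes the similar-looking identifiers are time-based (version 1) UUIDs, which is the kind MySQL's UUID() function is documented to produce; whether Catalogue of Life actually mints its identifiers that way remains my guess. The variable names are just for the example.

```python
import uuid

# Version 1 UUIDs combine a timestamp with a node id (typically derived from
# the host), so several minted on one machine in quick succession share their
# trailing node field and differ mainly in the leading time bits.
v1_ids = [uuid.uuid1() for _ in range(3)]

# Version 4 UUIDs are random, so independent parties can mint them without
# coordination and no visible pattern is expected between them.
v4_ids = [uuid.uuid4() for _ in range(3)]

print("time-based (v1):")
for u in v1_ids:
    print("  ", u)

print("random (v4):")
for u in v4_ids:
    print("  ", u)
```

Either kind is globally unique for practical purposes; the uncoordinated minting only buys you something when there are genuinely several independent minters.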
I think users would cope with UUIDs if the databases using them provide clear value. For example, MusicBrainz uses UUIDs (e.g. http://musicbrainz.org/artist/563201cb-721c-4cfb-acca-c1ba69e3d1fb.html), as does Mendeley, the latter hiding them from users via human-readable URLs. Given that we have obvious user-friendly candidates for URLs (taxonomic names), it would be trivial to hide UUIDs behind names (making homonyms distinct by adding authorship, or whatever it took to make them unique as strings).
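A minimal sketch of hiding UUIDs behind name-based URL slugs might look like the following. It is an assumption-laden toy, not anybody's actual service: slugify, NameIndex, and the placeholder homonyms "Aus bus Smith, 1900" and "Aus bus Jones, 1950" are all invented for illustration, and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is just one plausible choice that only handles ASCII.

```python
import re
import uuid


def slugify(name, authorship=""):
    """Turn a name (plus optional authorship) into a URL-friendly string."""
    text = f"{name} {authorship}".strip().lower()
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


class NameIndex:
    """Maps human-readable slugs to UUIDs that users never need to see."""

    def __init__(self):
        self._uuid_by_slug = {}
        self._slugs_by_name = {}

    def register(self, name, authorship):
        # Authorship is always part of the slug, so homonyms stay distinct.
        slug = slugify(name, authorship)
        if slug not in self._uuid_by_slug:
            self._uuid_by_slug[slug] = uuid.uuid4()
            self._slugs_by_name.setdefault(slugify(name), []).append(slug)
        return slug

    def resolve(self, slug):
        """Return the hidden UUID for a full slug."""
        return self._uuid_by_slug[slug]

    def candidates(self, name):
        """Return all slugs matching a bare name (more than one for homonyms)."""
        return list(self._slugs_by_name.get(slugify(name), []))


index = NameIndex()
first = index.register("Aus bus", "Smith, 1900")
second = index.register("Aus bus", "Jones, 1950")
```

Here index.candidates("Aus bus") lists both slugs, so a bare name can be offered as a disambiguation page, while each full slug resolves to exactly one UUID.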
What, if anything, is a taxonomic name?
In my experience, when non-taxonomists meet taxonomists things get ugly. For example, a publisher wanting to mark up taxonomic names in text might ask taxonomists how to do this, and within minutes the taxonomists are off into discussions of namestrings versus usages versus concepts, and pretty soon the publisher deeply regrets ever asking the question. I've been at meetings where the look in publishers' eyes said "run away, run away".
Part of the reason we have multiple databases is that different projects are capturing different things (roughly speaking, uBio is mostly about namestrings, Catalogue of Life is about concepts, IPNI and ZooBank are about first usage of a name, etc.).
Most users outside our field won't give a damn about the niceties of these distinctions, yet we persist in discussing them ad nauseam. Until we provide a single, very simple service that takes a name string and hides all this complexity (unless the user chooses to see the gory details) while still providing useful information, we will be stuck in multiple identifier hell. The tragedy is we've never had more people genuinely interested in linking to names than at present -- publishers are desperately trying to add "semantic value" to their content, and we are spectacularly ill-equipped to deliver this (and it's our own fault).
I rather suspect we're rapidly approaching the point where users outside taxonomy will simply say "to hell with these taxonomists, let's just use Wikipedia and be done with it."
Regards
Rod
Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1962@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content