Re: [tdwg-content] Identifier limbo [SEC=UNCLASSIFIED]

7 Jun 2011

Rod,

Despair?  Except for DOIs, we do all of this already. Most of the
infrastructure has been in place for many years.  It has taken a long
time, but the early work has attracted funds and we are now going all
out to build on the underlying datasets.

We currently support URIs and URN:LSIDs, but if DOIs add value and we
can afford to apply them, perhaps we should add them too. We would need
a little under 1 million handles for AFD and APNI/APC if we dare to
try. As we continue your work to map nomenclatural and taxonomic
events into BHL, where do we apply these DOIs? To the indexed object or
to the place on the page?  Do we pretend that they are the same
thing?

In reality it is the adoption of name and taxon objects that brings
re-usability. True citability will come when these objects are found
in taxonomic works.  Our indices are quite happy dealing with
prepackaged identifiers - even if they are DOIs.

At the nomenclator level we are pretty close to setting name objects
free to be the factual open community resource they were designed to
be - at least in the botanical domain.

I agree that versioning is not an issue (twittered comments excepted)
- these objects re-present facts which we should be able to repair
together.  But concepts anchor value-added products and they need to be
identified.  One's choice of concept (accepted/valid) depends largely
on context, which is usually provided as a classification (albeit
fragmentary).   Maybe you're right and nobody cares, and they just take
the current concept within some default context, but over time we find
that documenting these choices turns out to be significant as concepts
and/or context drift to accommodate current thinking.  Appropriate
infrastructure design supports re-use of these decisions anyway. The
scale here may be months or centuries - these principles still apply.

greg

On 6 June 2011 19:35, Roderic Page <r.page@bio.gla.ac.uk> wrote:
...
Reading this thread makes me despair. It's as if we are determined not to make progress, forever debating identifiers and what they identify, with seemingly little hope of resolution, and no clear vision of what the goals are. We wallow in acronym soup, and enjoy the technical challenges, but don't actually get anywhere (http://iphylo.blogspot.com/2010/04/biodiversity-informatic-fail-and-what.htm... )
Here's one vision of a way forward.
1. Someone or some entity shows a little vision and courage, and provides a taxonomic classification where every node in the tree gets a DOI.
2. The nodes in the taxonomic classification are citable objects (hence the DOIs).
3. Nodes in the classification can be accessed either by DOI or by name-based HTTP URI (multiple nodes for a name bounce to ambiguity resolution pages).
4. A taxon-name extraction service locates names in text.
5. We build a taxon name/literature index (which we pretty much have already, albeit distributed and partly proprietary).
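Point 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any existing service: the TaxonNode and Classification classes are hypothetical, and the DOIs use a made-up 10.9999 prefix. Lookup by DOI returns at most one node, while lookup by name reports "ambiguous" when several nodes share the name, which is where a real service would bounce to an ambiguity resolution page.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TaxonNode:
    doi: str   # hypothetical DOI; the 10.9999 prefix is made up
    name: str
    note: str  # free-text hint used to tell homonyms apart


class Classification:
    """Hypothetical index of classification nodes by DOI and by name."""

    def __init__(self, nodes: List[TaxonNode]) -> None:
        self._by_doi: Dict[str, TaxonNode] = {n.doi: n for n in nodes}
        self._by_name: Dict[str, List[TaxonNode]] = defaultdict(list)
        for node in nodes:
            self._by_name[node.name].append(node)

    def resolve_doi(self, doi: str) -> Optional[TaxonNode]:
        # A DOI identifies exactly one node, or nothing.
        return self._by_doi.get(doi)

    def resolve_name(self, name: str) -> Tuple[str, List[TaxonNode]]:
        # A name may match zero, one, or several nodes.
        matches = list(self._by_name.get(name, []))
        if not matches:
            return ("not_found", [])
        if len(matches) == 1:
            return ("found", matches)
        return ("ambiguous", matches)


# Ficus is both a plant genus and a gastropod genus, so the name alone
# cannot pick a node.
demo = Classification([
    TaxonNode("10.9999/taxon.1", "Ficus", "plant genus (figs)"),
    TaxonNode("10.9999/taxon.2", "Ficus", "gastropod genus (fig shells)"),
    TaxonNode("10.9999/taxon.3", "Acacia", "plant genus"),
])

status, candidates = demo.resolve_name("Ficus")
```

Here status is "ambiguous" and candidates holds both Ficus nodes, whereas demo.resolve_doi("10.9999/taxon.2") picks out the gastropod genus directly.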
Now, when an author (in any field) writes a paper or publishes some data on a taxon they cite the node in the classification as they would any scientific paper. Through CrossRef's citation tracking mechanism, the taxon database automatically accumulates the scientific literature relevant to that taxon.
When a journal publishes an article it calls the name extraction service to make the names clickable, avoiding the need for the journal to create its own taxon pages (a la Pensoft), and automatically building a taxonomic index to the literature. Publishers get enriched content, we get an always up to date taxonomic index.
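As a rough sketch of what "making the names clickable" could look like, assuming the simplest possible approach (matching against a list of already known names rather than any real name-finding algorithm): the link_names function and the https://example.org/taxon/ base URL below are hypothetical, chosen only for illustration.

```python
import re
from typing import Iterable
from urllib.parse import quote


def link_names(text: str, known_names: Iterable[str],
               base_url: str = "https://example.org/taxon/") -> str:
    """Wrap each known taxon name found in text in an HTML link.

    base_url is a placeholder for a name-based HTTP URI scheme.
    """
    names = sorted(set(known_names), key=len, reverse=True)
    if not names:
        return text
    # Longest names come first so that a binomial wins over its genus.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b"
    )

    def to_link(match: "re.Match[str]") -> str:
        name = match.group(1)
        return f'<a href="{base_url}{quote(name)}">{name}</a>'

    return pattern.sub(to_link, text)


linked = link_names(
    "We sampled Acacia dealbata near Canberra.",
    {"Acacia", "Acacia dealbata"},
)
```

With these inputs the binomial is linked as a single unit and the genus alone is not linked separately. A production service would need to handle abbreviations, authorship strings and misspellings, none of which this sketch attempts.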
So, we have services that authors and publishers can use, and use an identifier scheme that publishers understand (and so do some, if not most authors).
But you say "what about RDF and the linked web?". Relax, DOIs are now linked data compliant.
But you say "what about versions?" OK, to a first approximation nobody cares about versions. They really don't. Obviously previous versions will be accessible, but the identifier always points to the most recent version.
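The versioning behaviour described here (the identifier resolves to the most recent version, with earlier versions still retrievable) might be sketched like this. VersionedStore is a hypothetical in-memory class, the identifier is made up, and nothing here reflects how DOIs are actually registered or resolved.

```python
from typing import Dict, List, Optional


class VersionedStore:
    """Hypothetical store: one identifier, many versions, latest by default."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def publish(self, identifier: str, content: str) -> int:
        # Append a new version and return its 1-based version number.
        history = self._versions.setdefault(identifier, [])
        history.append(content)
        return len(history)

    def get(self, identifier: str, version: Optional[int] = None) -> str:
        # Raises KeyError for an unknown identifier or version.
        history = self._versions[identifier]
        if version is None:
            return history[-1]  # bare identifier: most recent version
        if not 1 <= version <= len(history):
            raise KeyError(f"{identifier} has no version {version}")
        return history[version - 1]


store = VersionedStore()
store.publish("10.9999/taxon.3", "Acacia, circumscription A")
store.publish("10.9999/taxon.3", "Acacia, circumscription B")

latest = store.get("10.9999/taxon.3")
first = store.get("10.9999/taxon.3", version=1)
```

The bare identifier returns the second record, and the first remains available when a version number is given.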
But you say "ah but it costs money". Yep, anything worthwhile does. Last time I checked, journals cost money, yet we seem to have lots of those.
But you say, "which classification to use?" Does anybody (outside taxonomy) actually care? Classification is a navigational convenience. If you care deeply, you'll make a phylogeny.
But you say, "why DOIs?" Several reasons: 1) publishers and authors understand them, 2) they avoid branding, 3) there's an infrastructure underpinning them, 4) they show that we are serious.
But "what about different taxon concepts?" To a first approximation nobody cares, and for the bulk of life we know too little for there to be much ambiguity. If we do care we can read the literature, which we have conveniently indexed.
Now, there are lots of things we could argue about, but if, say, EOL had done something like this at the start, namely embedded itself in the publication process, and major journals were citing EOL pages and linking to EOL pages, we would have a wonderful tool that was actually useful, way outside our own narrow concerns. Note that I'm using EOL as an example of an organisation with sufficient scope; GBIF would be another candidate.
I suspect a major reason for our continued failure is the lack of clearly identified users (and I don't mean people who read this list, or TAXACOM), and a failure of ambition.
Anyway, my coffee has arrived. I don't hold out much hope of any of this happening, and I fully expect us to be debating these issues in a year from now. Pity.
Regards
Rod
---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962@aim.com
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
-- 
Greg Whitbread
Australian National Botanic Gardens
Australian National Herbarium
+61 2 62509482
ghw@anbg.gov.au
