Dear Steve,
Yep, the Emperor has no clothes.
I don't think the failure of LSIDs is completely down to LSID-specific issues. HTTP URIs in the linked data sense have their own issues, and if you follow the technical discussions every so often people get very agitated about the (in)famous HTTP 303 redirect solution to information versus non-information resources. DOIs are also technically non-trivial to implement.
I suspect the failure of LSIDs is due to a combination of issues:
1. Most biodiversity data providers are small, poorly resourced operations that can't guarantee 24/7 operation.
2. Nobody provides central support/monitoring of LSIDs (i.e., there is no CrossRef equivalent where you can report a broken LSID and have a reasonable expectation that someone will fix it).
3. The community as a whole doesn't value unique identifiers (if they did, we wouldn't be in this mess).
4. There are no LSID consumers. The scientific publishing industry's linking infrastructure depends on DOIs, and publishers use CrossRef services as part of their publishing workflow. There is no equivalent for our field (another way of restating #3).
These issues won't go away simply by replacing LSIDs with HTTP URIs. Some of the linked data sources go belly up fairly regularly (notably http://bio2rdf.org ).
Regarding specifics, Catalogue of Life LSIDs have been broken for over a year (see http://iphylo.blogspot.com/2010/01/thoughts-on-international-year-of.html ). Nobody seems to care (see #3 above).
IPNI supports versioning of LSIDs, but the LSID without the version is also valid (it resolves to the latest version). I think this is a perfectly valid thing to do. Versioning matters to some, but not everyone cares about it, so IPNI supports both views. If you want to cite a specific version, do so; if not, don't.
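The versioned/unversioned distinction can be sketched in a few lines of Python. This is only an illustration, assuming the usual LSID layout of "urn:lsid:authority:namespace:object" with an optional trailing ":revision". The helper names (parse_lsid, without_revision) are hypothetical, and the IPNI identifier in the usage note is an illustrative value rather than one taken from this thread.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]  # None when the LSID carries no version


def parse_lsid(text: str) -> Lsid:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.strip().split(":")
    is_lsid = (
        len(parts) in (5, 6)
        and parts[0].lower() == "urn"
        and parts[1].lower() == "lsid"
    )
    if not is_lsid:
        raise ValueError(f"not an LSID: {text!r}")
    # A sixth, non-empty component is treated as the revision (assumption).
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(parts[2], parts[3], parts[4], revision)


def without_revision(text: str) -> str:
    """Return the unversioned form, i.e. the one that points at the latest version."""
    lsid = parse_lsid(text)
    return f"urn:lsid:{lsid.authority}:{lsid.namespace}:{lsid.object_id}"
```

For example, without_revision("urn:lsid:ipni.org:names:20012728-1:1.1") drops the final component and leaves the identifier you would cite if you don't care about versions; the full string is what you would cite if you do.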
Index Fungorum's web services are broken (see #1 above).
The data/metadata distinction is a complete red herring and has side-tracked more LSID conversations than I care to remember. The distinction stems from LSIDs originally being conceived as identifiers for large data objects (e.g., sequences, images, binary streams) that people want accurately versioned so that they can reproduce digital experiments. In this scenario, metadata can change because it's not what matters for reproducibility. Some people think of data about taxonomic names as "metadata", and then we're off on a wild goose chase about whether LSIDs should change when metadata changes. I wouldn't lose any sleep over this (see discussion of IPNI above).
uBio is one of the few success stories of biodiversity informatics (props to Dave Remsen and David Patterson for setting this up, and people like Anthony Goddard for keeping it running). They've had LSIDs running since about 2005, and pretty much the only reason I keep my LSID server running is because they use some vocabulary terms I coined way back when.
Regarding DOIs for books, these are uncommon, although there are moves to express ISBNs within the DOI framework, so there may be more of these. But for a DOI to exist someone has to claim ownership of a resource and register the DOI with CrossRef. For a lot of older literature there won't be a publisher around to do this, but that's where BHL comes in.
In summary, we're in a mess, and I don't think this is really down to technology. It's a failure of our community to create the appropriate resources (e.g., centralised, curated resources of identifiers and associated metadata for names, publications, and specimens).
Regards
Rod
On 6 Jan 2011, at 05:32, Steve Baskauf wrote:
Well, I have continued my quest for resolvable, RDF-producing GUIDs for taxon/name-related stuff. I have gotten a lot of good information from reading Rod Page's BMC Bioinformatics paper (http://dx.doi.org/10.1186/1471-2105-10-S14-S5) and from investigating his http://bioguid.info/ site.
From the standpoint of the "sec./sensu" part of a TNU/taxon concept, based on the recent discussion, it sounds like the DOI solution for publications is a good direction to go IF resolution services producing RDF come into existence and IF it becomes possible to actually search for the DOIs of more obscure publications. I tried using Rod's site to look up a journal article using the ISSN, volume, and page and the web interface found the DOI and generated RDF just fine. However, an attempt to use the web to find the DOI of Gleason and Cronquist's Manual of vascular plants of Northeastern United States and adjacent Canada failed despite a half hour of effort (I found the UPC, the LOC call number, and the ISBN, but no DOI). Maybe there just isn't a DOI for it but there should be a way for me to know that. So DOIs for books and old journal articles are not really ready for prime time.
From the standpoint of the "scientific name" part of a TNU/taxon concept, I had better luck (sort of). Rod's "Status of biodiversity services" page (http://www.bioguid.info/status/) was really cool. I saw resources I hadn't known about before. I tried out several of the services that claimed to issue LSIDs.
Catalog of Life's LSIDs didn't work with either the http://www.bioguid.info/ or http://lsid.tdwg.org/ proxies with either a web browser or the OpenLink RDF browser. I only got an empty RDF element in response.
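For readers unfamiliar with LSID proxies, the idea can be sketched as below. This assumes a proxy is addressed by appending the LSID to the proxy's base URL, which is an assumption about the two services named above rather than something stated here. The function name proxied_uri is hypothetical, and the uBio identifier in the usage note is illustrative.

```python
# Proxy base URLs as named in the message; the "base + LSID" pattern is an assumption.
TDWG_PROXY = "http://lsid.tdwg.org/"
BIOGUID_PROXY = "http://www.bioguid.info/"


def proxied_uri(lsid: str, proxy: str = TDWG_PROXY) -> str:
    """Turn an LSID into an HTTP URI by prefixing it with a proxy base URL."""
    lsid = lsid.strip()
    if not lsid.lower().startswith("urn:lsid:"):
        raise ValueError(f"not an LSID: {lsid!r}")
    # Normalise the slash so the base may be given with or without one.
    return proxy.rstrip("/") + "/" + lsid
```

With these assumptions, proxied_uri("urn:lsid:ubio.org:namebank:11815") gives a URI that a web browser or RDF browser can dereference, which is the kind of test described here.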
Index Fungorum was down.
IPNI seemed to work. However, I was somewhat appalled to observe that they seem to change the revision identifier any time that they change any part of the metadata. That renders the LSID useless as a permanent GUID for the name and I believe is inconsistent with the design of LSIDs where the revision is only supposed to change if the underlying data itself (NOT metadata) changes. (Catalog of Life says that they change the revision identifier EACH YEAR for all of their records! That's even worse!) If I'm remembering the TDWG LSID recommendations correctly, it is not even recommended to use the revision part of an LSID at all in the biodiversity informatics context.
ubio.org's LSIDs seemed to work properly.
[sorry - didn't try zoobank since I was looking for plants]
I don't know which (if any) of the Web sites listed on Rod's status page use generic HTTP URI GUIDs (rather than LSIDs) to refer to taxon names. I tried out the Global Names Index that Pete was mentioning. The URI versions of the UUIDs (e.g. http://gni.globalnames.org/name_strings/3a70f04d-fd29-5570-ba91-52dae0c3d07f) do resolve under content negotiation, but the only useful information that the RDF representation seems to provide is the actual name string that was used to generate the UUID. Until some other useful linked information is added to the RDF, there doesn't seem to be much advantage in pointing a semantic client to the URI over just using a string literal for the name.
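As a rough sketch of what "resolve under content negotiation" means for a client: the same URI is requested with an Accept header asking for RDF instead of HTML. The example uses only the Python standard library. The function names are hypothetical, and application/rdf+xml is assumed to be the RDF serialisation the server offers.

```python
import urllib.request

RDF_XML = "application/rdf+xml"  # assumed media type for the RDF representation


def rdf_request(uri: str) -> urllib.request.Request:
    """Build a GET request that asks for RDF/XML rather than an HTML page."""
    return urllib.request.Request(uri, headers={"Accept": RDF_XML})


def fetch_rdf(uri: str, timeout: float = 30.0) -> str:
    """Dereference the URI and return the response body as text.

    urlopen follows HTTP redirects, so a redirect to a separate RDF
    document is handled without extra code.
    """
    with urllib.request.urlopen(rdf_request(uri), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_rdf on the Global Names URI above would return whatever RDF the server sends back, if the service is up; whether that RDF contains anything beyond the name string is a separate question.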
So the bottom line is that of the LSID services for names that I've tried so far, only ubio.org seems to have LSIDs that are unchanging, work as proxied URIs, and produce actual useful RDF. That's pretty disappointing given the apparently huge amount of work that's been put into building these various systems.
Steve
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu
--------------------------------------------------------- Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1962@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html