[tdwg-content] most GUIDs/URIs for names/taxon stuff not ready for prime time
r.page at bio.gla.ac.uk
Thu Jan 6 09:28:09 CET 2011
Yep, the Emperor has no clothes.
I don't think the failure of LSIDs is completely down to LSID-specific
issues. HTTP URIs in the linked data sense have their own issues, and
if you follow the technical discussions every so often people get very
agitated about the (in)famous HTTP 303 redirect solution to
information versus non-information resources. DOIs are also
technically non-trivial to implement.
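For anyone who hasn't followed the 303 arguments, the pattern being fought over looks roughly like this. It's a minimal Python sketch, not anyone's actual server: the /id/ and /doc/ path layout and the function name resolve are hypothetical, and the Accept header is checked with a crude substring test rather than proper q-value parsing.

```python
# Hypothetical URI layout (an assumption for illustration only):
#   /id/<key>        identifies the thing itself (a non-information resource)
#   /doc/<key>.rdf   RDF document describing it
#   /doc/<key>.html  HTML document describing it

RDF_TYPES = ("application/rdf+xml", "text/turtle")


def resolve(path, accept="*/*"):
    """Return (status, headers) for a GET request, following the 303 pattern."""
    if path.startswith("/id/"):
        key = path[len("/id/"):]
        # Crude content negotiation: does the client mention an RDF media type?
        wants_rdf = any(media_type in accept for media_type in RDF_TYPES)
        extension = "rdf" if wants_rdf else "html"
        # 303 See Other: we can't send the thing itself, only a document about it.
        return 303, {"Location": f"/doc/{key}.{extension}", "Vary": "Accept"}
    if path.startswith("/doc/"):
        is_rdf = path.endswith(".rdf")
        content_type = "application/rdf+xml" if is_rdf else "text/html"
        return 200, {"Content-Type": content_type}
    return 404, {}


if __name__ == "__main__":
    print(resolve("/id/abc", accept="application/rdf+xml"))
    print(resolve("/id/abc", accept="text/html"))
```

Even in this toy form every lookup costs two round trips, and the server has to keep the /id/ and /doc/ spaces in step, which is part of why people get agitated about it.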
I suspect the failure of LSIDs is due to a combination of issues:
1. Most biodiversity data providers are small, poorly resourced
operations that can't guarantee 24/7 availability
2. Nobody provides central support/monitoring of LSIDs (i.e., there is
no CrossRef equivalent where you can report a broken LSID and have a
reasonable expectation that someone will fix it).
3. The community as a whole doesn't value unique identifiers (if they
did, we wouldn't be in this mess)
4. There are no LSID consumers. The scientific publishing industry's
linking infrastructure depends on DOIs, and publishers use CrossRef
services as part of their publishing workflow. There is no equivalent
for our field (another way of restating #3).
These issues won't go away simply by replacing LSIDs with HTTP URIs.
Some of the linked data sources go belly up fairly regularly (notably http://bio2rdf.org).
Regarding specifics, Catalogue of Life LSIDs have been broken for over
a year (see http://iphylo.blogspot.com/2010/01/thoughts-on-international-year-of.html
). Nobody seems to care (see #3 above).
IPNI supports versioning of LSIDs, but the LSID without the version is
also valid (it resolves to the latest version). I think this is a
perfectly valid thing to do. Versioning matters to some, but not
everyone cares about it, and IPNI supports both views. If you want to cite
a specific version, do so; if not, don't.
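To make the versioned/unversioned distinction concrete, here is a rough Python sketch that splits an LSID of the form urn:lsid:authority:namespace:object[:revision] and drops the optional revision. The helper names (parse_lsid, unversioned, proxied) and the example.org identifier are made up for illustration, and the proxy convention of simply appending the LSID to the proxy URL is an assumption on my part, not a statement of how any particular provider resolves things.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.strip().split(":")
    prefix = [p.lower() for p in parts[:2]]
    if len(parts) not in (5, 6) or prefix != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    if any(p == "" for p in parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    # The revision is optional; treat a trailing empty field as absent.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


def unversioned(text):
    """Return the LSID with any revision removed (the 'latest version' form)."""
    lsid = parse_lsid(text)
    return f"urn:lsid:{lsid.authority}:{lsid.namespace}:{lsid.object_id}"


def proxied(text, proxy="http://lsid.tdwg.org/"):
    """Build an HTTP URI for an LSID, assuming the proxy just takes it as a path."""
    parse_lsid(text)  # validate first
    return proxy + text.strip()


if __name__ == "__main__":
    example = "urn:lsid:example.org:names:12345-1:1.1"  # hypothetical identifier
    print(parse_lsid(example))
    print(unversioned(example))
    print(proxied(unversioned(example)))
```

Someone who cares about a particular revision keeps the full string; everyone else can store the unversioned form and let the provider hand back whatever is current.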
Index Fungorum's web services are broken (see #1 above).
The data/metadata distinction is a complete red herring and has side-
tracked more LSID conversations than I care to remember. The
distinction stems from LSIDs originally being conceived as identifiers
for large data objects (e.g., sequences, images, binary streams) that
people want accurately versioned so that they could reproduce digital
experiments. In this scenario, metadata can change because it's not
what matters for reproducibility. Some people think of data
about taxonomic names as "metadata", and then we're off on a wild
goose chase about whether LSIDs should change when metadata changes. I
wouldn't lose any sleep over this (see discussion of IPNI above).
uBio is one of the few success stories of biodiversity informatics
(props to Dave Remsen and David Patterson for setting this up, and
people like Anthony Goddard for keeping it running). They've had LSIDs
running since about 2005, and pretty much the only reason I keep my
LSID server running is because they use some vocabulary terms I coined
way back when.
Regarding DOIs for books, these are uncommon, although there are moves
to express ISBNs within the DOI framework, so there may be more of
these. But for a DOI to exist someone has to claim ownership of a
resource and register the DOI with CrossRef. For a lot of older
literature there won't be a publisher around to do this, but that's
where BHL comes in.
In summary, we're in a mess, and I don't think this is really down to
technology. It's a failure of our community to create the appropriate
resources (e.g., centralised, curated resources of identifiers and
associated metadata for names, publications, and specimens).
On 6 Jan 2011, at 05:32, Steve Baskauf wrote:
> Well, I have continued my quest for resolvable, RDF-producing GUIDs for
> taxon/name-related stuff. I have gotten a lot of good information
> reading Rod Page's BMC Bioinformatics paper
> (http://dx.doi.org/10.1186/1471-2105-10-S14-S5) and from investigating
> his http://bioguid.info/ site.
> From the standpoint of the "sec./sensu" part of a TNU/taxon concept,
> based on the recent discussion, it sounds like the DOI solution for
> publications is a good direction to go IF resolution services that produce
> RDF come into existence and IF it becomes possible to actually search
> for the DOIs of more obscure publications. I tried using Rod's site to
> look up a journal article using the ISSN, volume, and page and the web
> interface found the DOI and generated RDF just fine. However, an
> attempt to use the web to find the DOI of Gleason and Cronquist's Manual
> of vascular plants of Northeastern United States and adjacent Canada
> failed despite a half hour of effort (I found the UPC, the LOC call
> number, and the ISBN, but no DOI). Maybe there just isn't a DOI for it,
> but there should be a way for me to know that. So DOIs for books and
> old journal articles are not really ready for prime time.
> From the standpoint of the "scientific name" part of a TNU/taxon
> concept, I had better luck (sort of). Rod's "Status of biodiversity
> services" page (http://www.bioguid.info/status/) was really cool. I found
> resources I hadn't known about before. I tried out several of the
> services that claimed to issue LSIDs.
> Catalog of Life's LSIDs didn't work with either the
> http://www.bioguid.info/ or http://lsid.tdwg.org/ proxies with either a
> web browser or the OpenLink RDF browser. I only got an empty RDF
> element in response.
> Index Fungorum was down.
> IPNI seemed to work. However, I was somewhat appalled to observe that
> they seem to change the revision identifier any time that they change
> any part of the metadata. That renders the LSID useless as a
> GUID for the name and I believe is inconsistent with the design of LSIDs,
> where the revision is only supposed to change if the underlying data
> itself (NOT metadata) changes. (Catalog of Life says that they change
> the revision identifier EACH YEAR for all of their records! That's
> worse!) If I'm remembering the TDWG LSID recommendations, it is not
> even recommended to use the revision part of an LSID at all in the
> biodiversity informatics context.
> ubio.org's LSIDs seemed to work properly.
> [sorry - didn't try zoobank since I was looking for plants]
> I don't know which (if any) of the Web sites listed on Rod's status page
> use generic HTTP URI guids (rather than LSIDs) to refer to taxon
> I tried out the Global Names index that Pete was mentioning. The URI
> version of the UUIDs (e.g.
> do resolve under content negotiation, but the only useful information
> that the RDF representation seems to provide is the actual name string
> that was used to generate the UUID. Until some other useful linked
> information is added to the RDF, there doesn't seem to be much
> in pointing a semantic client to the URI over just using a string
> literal for the name.
> So the bottom line is that of the LSID services for names that I've
> tried so far, only ubio.org seems to have LSIDs for names that are
> unchanging, can work as a proxied URI, and that produce actual useful
> RDF. That's pretty disappointing given the apparently huge amount of
> work that's been put into building these various systems.
> Steven J. Baskauf, Ph.D., Senior Lecturer
> Vanderbilt University Dept. of Biological Sciences
> postal mail address:
> VU Station B 351634
> Nashville, TN 37235-1634, U.S.A.
> delivery address:
> 2125 Stevenson Center
> 1161 21st Ave., S.
> Nashville, TN 37235
> office: 2128 Stevenson Center
> phone: (615) 343-4582, fax: (615) 343-6707
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962 at aim.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html