LSIDs and taxonomic schema - an alternative view

Roderic D. M. Page r.page at BIO.GLA.AC.UK
Mon Nov 15 21:28:44 CET 2004


Based on playing with LSIDs as part of the Taxonomic Search Engine
(http://darwin.zoology.gla.ac.uk/~rpage/portal ), and following some
of the recent efforts on developing taxonomic name schema (such as
the LinneanCore), I've become concerned that the implications of
LSIDs have not been fully thought through.

Specifically, I worry that by focusing on schema for names in their
present format, the community is making more work for itself, and
missing the point (and potential) of LSIDs.

Metadata
--------

LSIDs are not simply identifiers, they come with associated metadata.
For example, the LSID
urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1 has
associated metadata which can be viewed directly at
http://ipni.org.lsid.zoology.gla.ac.uk/authority/metadata/?lsid=urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1
, or by using an LSID resolver
(http://biopathways.ibm.nebiogrid.org/resolver/urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1
).
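To make the structure of such an identifier concrete, here is a minimal
sketch in Python that splits an LSID into its parts. It assumes the general
LSID layout urn:lsid:authority:namespace:object, with an optional trailing
revision; the names LSID and parse_lsid are made up for this illustration
and are not part of any LSID toolkit.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """The components of an LSID (illustrative type, not a library class)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not a well-formed LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not a well-formed LSID: {text!r}")
    if not all(parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


lsid = parse_lsid("urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1")
print(lsid.authority, lsid.namespace, lsid.object_id)
```

A client would use the authority part to work out where to ask for the
metadata; the sketch only checks the shape of the identifier and makes no
network calls.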

This metadata is in RDF (Resource Description Framework), and provides
information about the name, and links to other resources (via LSIDs).
For example, the above record for *Poissonia heterantha* has a link
to its basionym, *Tephrosia heterantha*.


We've been here before
-----------------------

It seems to me that there is an assumption that all we do with LSIDs
is stick them in an XML document as an identifier ("GUID"), and our
work is done. I feel this rather misses the point.

The key point is that, if we serve LSIDs we need to serve metadata
about the names. So, we need a standard for the metadata. But, hang
on, we've just spent energy on a standard for our data...? So, once a
schema is agreed, someone then has to create a new schema for
the metadata for LSIDs..? And, how do these schema relate...? Hmmmm.

One response to these ideas might be to simply serve a document based
on one of the current schema (such as the LinneanCore) as metadata.
But I think this is a poor solution that doesn't exploit the
potential of RDF metadata.

RDF
---

RDF is very cool, in that there are tools that can take RDF and
reason about it. For instance, given metadata for the LSIDs
urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1 (Poissonia
heterantha), and urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:944651-1
(Coursetia heterantha), we can infer that these two names are
synonyms, because they share the same basionym.
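As a rough sketch of that inference, the Python below treats the metadata as
plain (subject, predicate, object) tuples rather than using a real RDF
library. The predicate name "hasBasionym" and the LSID used for the basionym
(*Tephrosia heterantha*) are placeholders invented for the example; the
actual vocabulary and identifier would come from the served metadata. It also
assumes that a name with no basionym link is itself a basionym.

```python
# Hypothetical predicate name; the real metadata vocabulary may differ.
HAS_BASIONYM = "hasBasionym"

POISSONIA = "urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1"
COURSETIA = "urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:944651-1"
# Placeholder LSID for Tephrosia heterantha (not a real identifier).
TEPHROSIA = "urn:lsid:example.org:Id:basionym-placeholder"

# Metadata for the two names, reduced to the statements we need.
triples = [
    (POISSONIA, HAS_BASIONYM, TEPHROSIA),
    (COURSETIA, HAS_BASIONYM, TEPHROSIA),
]


def basionym_of(lsid, triples):
    """Return the basionym linked from a name, or the name itself if none."""
    for subject, predicate, obj in triples:
        if subject == lsid and predicate == HAS_BASIONYM:
            return obj
    # Assumption: a name without a basionym link is its own basionym.
    return lsid


def are_synonyms(a, b, triples):
    """Two names are inferred to be synonyms if they share a basionym."""
    return basionym_of(a, triples) == basionym_of(b, triples)


print(are_synonyms(POISSONIA, COURSETIA, triples))
```

Nothing in the data says outright that the two names are synonyms; the
conclusion follows from the two basionym statements alone. A proper RDF
toolkit with a rule engine could draw the same conclusion from the served
metadata without hand-written code like this.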

If we use RDF, we get this kind of ability for "free." There is a lot
of work in the semantic web, ontology, and bioinformatics
communities about making inferences like this. Isn't this the kind of
thing we want to do, rather than simply pass XML documents around?
Wouldn't it be nice to take two or more LSIDs and work out their
relationship automatically (where possible)? Client databases that
stored LSIDs could work out whether names were synonyms in a standard
way, without actually having to be told that the names are synonyms.


A radical view
--------------

Instead of developing schema for exchanging information that are
specific to taxonomy, a radical approach would be to adopt LSIDs as
identifiers and standardise the associated metadata (which would
convey all the information about the name). By adopting RDF we can
tap into a lot of existing work, as well as existing external
standards (e.g., Dublin Core, and the emerging use of LSIDs in
bioinformatics). It also offers an opportunity to serve up
information on taxonomic names in a much more useful form than a
simple XML document.

I wonder if an opportunity is being missed here.

Regards

Rod


--
--------------------------------------------------------
Professor Roderic D. M. Page
Editor Elect, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom


Phone:    +44 141 330 4778
Fax:      +44 141 330 2792
email:    r.page at bio.gla.ac.uk
web:      http://taxonomy.zoology.gla.ac.uk/rod/rod.html
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html

Subscribe to Systematic Biology through the Society of Systematic
Biologists Website:  http://systematicbiology.org

Search for taxonomic names at http://darwin.zoology.gla.ac.uk/~rpage/portal



