Based on playing with LSIDs as part of the Taxonomic Search Engine
(http://darwin.zoology.gla.ac.uk/~rpage/portal ), and following some
of the recent efforts on developing taxonomic name schema (such as
the LinneanCore), I've become concerned that the implications of
LSIDs have not been fully thought through.
Specifically, I worry that by focusing on schema for names in their
present format, the community is making more work for itself, and
missing the point (and potential) of LSIDs.
Metadata
--------
LSIDs are not simply identifiers, they come with associated metadata.
For example, the LSID
urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012729-1 has
associated metadata which can be viewed directly at
http://ipni.org.lsid.zoology.gla.ac.uk/authority/metadata/?lsid=urn:lsid:ip…
, or by using a LSID resolver
(http://biopathways.ibm.nebiogrid.org/resolver/urn:lsid:ipni.org.lsid.zoolog…
).
This metadata is in RDF (Resource Description Framework), and provides
information about the name, and links to other resources (via LSIDs).
For example, the above record for *Poissonia heterantha* has a link
to its basionym, *Tephrosia heterantha*.
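To make the identifier itself concrete: an LSID is a URN of the form
urn:lsid:authority:namespace:object, optionally followed by a revision.
The following is a minimal Python sketch of splitting one into its
parts. The names `LSID` and `parse_lsid` are purely illustrative and are
not taken from any LSID library or from the Taxonomic Search Engine.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Components of an LSID (illustrative type, not a standard API)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split urn:lsid:authority:namespace:object[:revision] into parts."""
    parts = text.split(":")
    well_formed = (
        len(parts) in (5, 6)
        and parts[0].lower() == "urn"
        and parts[1].lower() == "lsid"
        and all(parts[2:])  # no empty components
    )
    if not well_formed:
        raise ValueError(f"not a well-formed LSID: {text!r}")
    return LSID(*parts[2:])


if __name__ == "__main__":
    lsid = parse_lsid("urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012729-1")
    print(lsid.authority, lsid.namespace, lsid.object_id, lsid.revision)
```

The authority part is what a resolver would use to locate the service
that returns the data and metadata for the object.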
We've been here before
-----------------------
It seems to me that there is an assumption that all we do with LSIDs
is stick them in an XML document as an identifier ("GUID"), and our
work is done. I feel this rather misses the point.
The key point is that, if we serve LSIDs we need to serve metadata
about the names. So, we need a standard for the metadata. But, hang
on, we've just spent energy on a standard for our data...? So, once a
schema is agreed, someone then has to create a new schema for
the metadata for LSIDs..? And, how do these schema relate...? Hmmmm.
One response to these ideas might be to simply serve a document based
on one of the current schema (such as the LinneanCore) as metadata.
But I think this is a poor solution that doesn't exploit the
potential of RDF metadata.
RDF
---
RDF is very cool, in that there are tools that can take RDF and
reason about it. For instance, given metadata for the LSIDs
urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1 (Poissonia
heterantha), and urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:944651-1
(Coursetia heterantha), we can infer that these two names are
synonyms, because they share the same basionym.
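As a rough sketch of this kind of inference, the metadata can be
treated as a set of (subject, predicate, object) triples, and names
grouped by the basionym they point to. This is plain Python rather than
a real RDF reasoner. The predicate name `hasBasionym` and the
example.org identifiers are placeholders invented for illustration;
they are not the vocabulary or identifiers actually served for these
records.

```python
from collections import defaultdict
from itertools import combinations

POISSONIA = "urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:20012728-1"
COURSETIA = "urn:lsid:ipni.org.lsid.zoology.gla.ac.uk:Id:944651-1"

# Hypothetical identifiers, used only to make the example self-contained.
SHARED_BASIONYM = "urn:lsid:example.org:names:basionym-1"
OTHER_NAME = "urn:lsid:example.org:names:unrelated-1"
OTHER_BASIONYM = "urn:lsid:example.org:names:basionym-2"

# Stand-in for triples extracted from each name's RDF metadata.
TRIPLES = [
    (POISSONIA, "hasBasionym", SHARED_BASIONYM),
    (COURSETIA, "hasBasionym", SHARED_BASIONYM),
    (OTHER_NAME, "hasBasionym", OTHER_BASIONYM),
]


def infer_synonyms(triples):
    """Return pairs of names that point to the same basionym."""
    names_by_basionym = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "hasBasionym":
            names_by_basionym[obj].add(subject)
    pairs = set()
    for names in names_by_basionym.values():
        # Every pair of names sharing a basionym is inferred synonymous.
        pairs.update(combinations(sorted(names), 2))
    return pairs


if __name__ == "__main__":
    for a, b in sorted(infer_synonyms(TRIPLES)):
        print(a, "is a synonym of", b)
```

Nothing in the input says the two names are synonyms; the relationship
falls out of the shared link. A fuller treatment would also relate each
name to the basionym itself, which this sketch leaves out.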
If we use RDF, we get this kind of ability for "free." There is a lot
of work in the semantic web, ontology, and bioinformatics
communities on making inferences like this. Isn't this the kind of
thing we want to do, rather than simply pass XML documents around?
Wouldn't it be nice to take two or more LSIDs and work out their
relationship automatically (where possible)? Client databases that
stored LSIDs could work out whether names were synonyms in a standard
way, without actually having to be told that the names are synonyms.
A radical view
--------------
Instead of developing schema for exchanging information that are
specific to taxonomy, a radical approach would be to adopt LSIDs as
identifiers and standardise the associated metadata (which would
convey all the information about the name). By adopting RDF we can
tap into a lot of existing work, as well as existing external
standards (e.g., Dublin Core, and the emerging use of LSIDs in
bioinformatics). It also offers an opportunity to serve up
information on taxonomic names in a much more useful form than a
simple XML document.
I wonder if an opportunity is being missed here.
Regards
Rod
--
--------------------------------------------------------
Professor Roderic D. M. Page
Editor Elect, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom
Phone: +44 141 330 4778
Fax: +44 141 330 2792
email: r.page(a)bio.gla.ac.uk
web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
Subscribe to Systematic Biology through the Society of Systematic
Biologists Website: http://systematicbiology.org
Search for taxonomic names at http://darwin.zoology.gla.ac.uk/~rpage/portal
Because I can't spell to save myself, I've added a "did you mean"
feature to the Taxonomic Search Engine
(http://darwin.zoology.gla.ac.uk/~rpage/portal/ ). You now have the
option of asking for suggested spellings, based on a list of names I
have stored on the server, as well as any suggestions offered by
Google.
For example, if you search on "Physeter catadon", you will be asked
if you really meant "Physeter catodon" (note the "o" instead of the
"a").
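The idea of suggesting a spelling from a stored list of names can be
sketched with Python's standard `difflib` module. This is only an
illustration of the general technique, not the method the Taxonomic
Search Engine actually uses; the contents of `KNOWN_NAMES`, the function
name `did_you_mean`, and the 0.8 cutoff are all assumptions.

```python
import difflib

# Stand-in for the list of names stored on the server (hypothetical contents).
KNOWN_NAMES = (
    "Physeter catodon",
    "Physeter macrocephalus",
    "Poissonia heterantha",
    "Coursetia heterantha",
)


def did_you_mean(query, names=KNOWN_NAMES, cutoff=0.8):
    """Return up to three close spellings of query, best match first.

    An exact match needs no suggestion, so it yields an empty list.
    """
    if query in names:
        return []
    return difflib.get_close_matches(query, names, n=3, cutoff=cutoff)


if __name__ == "__main__":
    print(did_you_mean("Physeter catadon"))
```

A higher cutoff gives fewer but safer suggestions; a lower one catches
worse misspellings at the cost of more false matches.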
Both the spelling suggestion and the name search itself are now
available as web services. Documentation and example clients are
available at the site (I hope to flesh these out as time permits).
Regards
Rod
http://blade63.cs.umb.edu:13200/axis/ITIS
has a limited web interface to an experimental SOAP Web Service against
a copy of ITIS data running on an Oracle installation here.
The WSDL is linked from that page, and we welcome experiments with
clients other than our own.
We don't keep the data particularly current, and the service is in use
by my course, so don't expect production quality. But please do try to
break it and send me mail if you succeed.
So, where is the XML-Schema for ITIS query results? Hah, hah just serious.
Bob Morris