[tdwg-tag] Fwd: LSIDs and ontology segmentation
RichardsK at landcareresearch.co.nz
Thu Jul 13 23:32:31 CEST 2006
I'm not sure how many of you follow the mailing list <public-semweb-lifesci at w3.org>, but here is an interesting recent posting regarding LSIDs and ontology documents.
This group is currently in a similar situation to us, debating the best approach for GUIDs and sorting out ontologies. It includes the BioRDF group that Suzie Stevens spoke of at the RDF workshop in Edinburgh (for those of you who attended).
They are looking at LSIDs but, as always, there are people for and against the technology.
The thread started by the following post goes on with some interesting arguments for and against using LSIDs as identifiers for "ontology fragments". You can see more at http://lists.w3.org/Archives/Public/public-semweb-lifesci/2006Jul/0076.html.
>>> Mark Wilkinson <markw at illuminae.com> 14/07/2006 5:01:33 a.m. >>>
I chatted with Sean Martin yesterday and he indicated that, on the last
SWLS teleconference, he mentioned one of the ideas that my lab and his
group have been tossing around for the past few weeks v.v. LSIDs
representing ontology nodes. He asked me to fill in some more detail in
a message to this mailing list.
In a publication that will be available soon [1] we (briefly) discuss
the problem of actually *using* the currently available ontologies in a
"real" Semantic Web setting - i.e. dynamically downloading whatever
ontologies are necessary given the predicates that you find in some
discovered RDF instance document. The OWL representation of GO is over
10 Meg... for heaven's sake!... and GO is a small ontology compared to
things like the NCI Metathesaurus.
The problem with using document#fragment URLs to identify ontology nodes
is that the defined behaviour for resolving such an identifier is to
drop the fragment (since that isn't available server-side anyway) and to
return the entire document... all 10 Meg of GO... each time... We
would argue, therefore, that the URL (if you adopt its default
behaviour) is not only a bit of a nuisance, it is a blocker in some/many
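
The fragment-dropping behaviour described above can be illustrated with a minimal Python sketch. The URL is a hypothetical placeholder, not a real ontology location; the point is only that the part after "#" is split off on the client and never reaches the server.

```python
from urllib.parse import urldefrag

# Hypothetical ontology term URL, for illustration only.
term_url = "http://example.org/ontologies/go.owl#GO_0008150"

# The fragment is separated client-side; only document_url is requested.
document_url, fragment = urldefrag(term_url)

print(document_url)  # the whole ontology document is what gets fetched
print(fragment)      # the term is located locally, after the full download
```

Whatever term the agent actually wanted, the request that goes out names only the full document.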
There's been some exciting work in the domain of ontology segmentation
[2,3,4,5] that, we believe, is perhaps a more rational way of working
with these massive ontologies when you need to get on-the-fly access to
only the portions of the ontology that are relevant to your Blackberry's
agent at that moment. I know that others (e.g. Damian Gessler and
collaborators at NCGR, but I don't have the reference to his submitted
manuscript at hand right now... sorry Damian!) are also working on the
problem of segmentation by passing a self-inflating "flattened" ontology
fragment. The problem is that there is no Semantic Web-style protocol
available to specify that this is the behaviour you want, or for the
agent to know that this is the behaviour to expect. Some of these
projects are setting up the ontology fragment-generator as a Web Service
(if I recall correctly, Rector's group does this [4]), however this
doesn't solve the SW problem either because we can't (easily) model a
Web Service invocation as a single URI (at least, not by any existing
standard or convention... I guess some long REST-style URLs could do
Here is where I think the LSID could really shine! Unlike a URL, the
LSID does not have to return an entire document in response to a
getMetadata call. Thus, if an LSID were used as the identifier for an
ontology node, the behaviour of the getMetadata call could be, by
convention or by standard, to return only the relevant ontology
fragment, where that fragment was generated by e.g. the Rector
Segmentation generator in the background.
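
As a rough sketch of that idea, assuming a toy in-memory ontology: the function names, the LSID authority and namespace, and the term data below are all hypothetical, and the naive "term plus its ancestors" traversal merely stands in for a real segmentation algorithm such as the one in [4]. This is not the LSID resolver API itself.

```python
# Toy ontology: each term maps to its direct parents. Hypothetical data.
ONTOLOGY = {
    "0000001": [],
    "0000002": ["0000001"],
    "0000003": ["0000002"],
    "0000004": ["0000001"],
}

def parse_lsid(lsid):
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = lsid.split(":")
    if (len(parts) not in (5, 6)
            or parts[0].lower() != "urn" or parts[1].lower() != "lsid"):
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }

def get_metadata(lsid):
    """Return only the named term and its ancestors, not the whole ontology."""
    term = parse_lsid(lsid)["object"]
    segment, todo = {}, [term]
    while todo:
        current = todo.pop()
        if current not in segment:
            segment[current] = ONTOLOGY[current]
            todo.extend(ONTOLOGY[current])
    return segment

segment = get_metadata("urn:lsid:example.org:toy-ontology:0000003")
print(sorted(segment))
```

The unrelated term 0000004 is left out of the response, which is the property a URL with a fragment cannot offer under its default resolution behaviour.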
These were just early thoughts we've been having, but Sean asked me to
share them with the group in hopes of fanning the flames of discussion
and debate. It seems to me to be a "blocker" issue when it comes to
deploying SW applications in the wild, and I know that projects like
Damian's Semantic MOBY have hit this problem early and hard, as have I
in my own sandbox. It's all well and good when we play SW on our own
local machine, but as soon as we try to play SW in the wi(l)der world
this problem cripples us almost instantly. We think the LSID is (a/the)
solution to this problem, but no solution will be useful if it doesn't
have wider adoption, so...
[1] Good, B, Wilkinson, M. (in press). The Life Sciences Semantic Web is
Full of Creeps! Briefings in Bioinformatics.
[2] Noy, N, Musen, M. Specifying Ontology Views by Traversal. 2004.
[3] Alani, H, Harris, S, O'Neil, B. Ontology Winnowing: A Case Study on
the AKT Reference Ontology. 2005.
[4] Seidenberg, J, Rector, A (2006), 'Web Ontology Segmentation:
Analysis, Classification and Use', World Wide Web, ACM, Edinburgh,
[5] Stuckenschmidt, H, Klein, M. Structure-Based Partitioning of Large
Concept Hierarchies. 2004.
Asst. Professor, Dept. of Medical Genetics
University of British Columbia
PI in Bioinformatics, iCAPTURE Centre
St. Paul's Hospital, Rm. 166, 1081 Burrard St.
Vancouver, BC, V6Z 1Y6
tel: 604 682 2344 x62129
fax: 604 806 9274
"Since the point of a definition is to explain the meaning of a term to
someone who is unfamiliar with its proper application, the use of
language that doesn't help such a person learn how to apply the term is
pointless. Thus, "happiness is a warm puppy" may be a lovely thought,
but it is a lousy definition."
Köhler et al, 2006