Re: [Tdwg-tag] Why we should not use LSID

3 May 2006

Donald wrote:
...
Considering SDD as an example, my interpretation of this is that the RDFS or
other definitions for classes such as Character, State or Modifier should be
accessible through URLs.  This does not mean that the instances of these
classes (i.e. what I might call the individual SDD data elements) need
themselves to be accessible in the same way.
I do understand that Gregor is concerned about our ability to reason over
the data as well as to validate the underlying documents.  However, if we do
choose to use technologies such as OWL to support such reasoning, I do not
believe that we can expect to reason over fully federated data.  Surely we
would expect to resolve the data (through LSIDs, PURLs, or whatever else)
and then process them locally?  We should certainly consider whether this is
an issue, but we should keep it separate from the main issue identified by
the TAG.

I fail to understand the distinction between class and instance. So far I 
understand that it is arbitrary, depending on the purpose. The class 
"SDD.Character" is expressed in an XML instance document (a W3C schema 
document, using instances of classes expressed in the W3C schema for schemas). 
So is the class "flower color", which is an instance (or "data" in Donald's 
sense) expressed in an SDD instance document. 

However, a taxon description makes use of "flower color" in the sense of a 
class definition, i.e. all descriptions using this term with a specific value 
are instances of the class flower color.

"Flower color" can be generalized to "color of flower-like structure". This 
often makes a lot of sense - e.g. in Compositae (sunflower, etc.) many people 
will give an answer about the inflorescence rather than about a flower, and 
even botany students get confused about the cyathium of Euphorbia. So we do 
want to make use of reasoning engines when processing taxon identification 
queries.

Exactly the same generalization relationships hold for taxa.
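
As a minimal sketch of the kind of inference I mean (in Python; the term names 
and the small in-memory triple list are my own illustration, not SDD or GBIF 
vocabulary, and a real RDFS/OWL reasoner would do far more than this):

```python
# Hypothetical triples; the names are illustrative only.
TRIPLES = [
    # From the schema's point of view, "FlowerColor" is data (an instance).
    ("FlowerColor", "type", "Character"),
    # From a description's point of view, it is a class that can be generalized.
    ("FlowerColor", "subClassOf", "FlowerLikeStructureColor"),
    ("InflorescenceColor", "subClassOf", "FlowerLikeStructureColor"),
    # Two taxon descriptions using the terms as classes.
    ("desc1", "type", "FlowerColor"),
    ("desc2", "type", "InflorescenceColor"),
]


def superclasses(term, triples):
    """Return the term itself plus everything reachable via subClassOf."""
    found = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls, triples):
    """Return subjects whose asserted type is cls or a subclass of cls."""
    return sorted(
        s for s, p, o in triples
        if p == "type" and cls in superclasses(o, triples)
    )


generalized = instances_of("FlowerLikeStructureColor", TRIPLES)
characters = instances_of("Character", TRIPLES)
```

Here the same term appears once as an instance (of "Character") and once as a 
class (of descriptions), and a query for the generalized character finds both 
descriptions. This only works if the reasoner is allowed to treat the term as 
a class.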

What I am trying to get across is my perception that those charged with 
developing the GBIF software draw the distinction from their own perspective. 
That is a good thing, but it seems we might be heading down a path that 
prevents us from ever changing the perspective, by requesting the use of LSIDs 
for what the software development perspective currently perceives as an 
instance. If my information is correct, this would prevent the use of standard 
reasoners to answer questions such as those I posed in my talk prepared for 
TAG-1 (I cannot post it to the email list, since only 200 kB are allowed).
...
I also suspect that LSIDs may be a really good way for us to handle many of
our controlled vocabularies.  Obviously those vocabularies which make up
the definition of classes and their properties may need URL access, but in
many other contexts (including, I would have thought, SDD) it may be more
sensible to treat the vocabulary terms as data objects.  This will allow us
to extend them with all kinds of metadata.
...
I would say that the major reason the GUID meetings avoided adopting PURLs
was simply that they give us no clean separation between the identifier and
the owner and location of the document.  LSIDs (provided we sort out
appropriate best practices for how they are constructed) may, among other
things, give us an intermediate layer we can conveniently manage to handle
this.

Can you explain this? I believe a PURL does exactly this. If I have 
purl.org/xyz it is an ID, but the document may be anywhere and may in fact 
move.
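
To illustrate what I mean by that separation, here is a minimal sketch in 
Python. The Resolver class and the example.org locations are hypothetical 
stand-ins for a redirect table; this is not how the purl.org service itself is 
implemented.

```python
class Resolver:
    """Toy redirect table: a stable identifier maps to a movable location."""

    def __init__(self):
        self._targets = {}

    def register(self, identifier, location):
        # Registering an existing identifier again just updates its location.
        self._targets[identifier] = location

    def resolve(self, identifier):
        return self._targets[identifier]


resolver = Resolver()
identifier = "purl.org/xyz"  # the ID that others cite

resolver.register(identifier, "http://host-a.example.org/doc.xml")
first = resolver.resolve(identifier)

# The document moves to another owner and location; the ID stays the same.
resolver.register(identifier, "http://host-b.example.org/archive/doc.xml")
second = resolver.resolve(identifier)
```

Anything that cites the identifier keeps working after the move, because only 
the table entry changes.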

Thanks!

Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn@bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203