Donald wrote:
Considering SDD as an example, my interpretation of this is that the RDFS or other definitions for classes such as Character, State or Modifier should be accessible through URLs. This does not mean that the instances of these classes (i.e. what I might call the individual SDD data elements) need themselves to be accessible in the same way.
I do understand that Gregor is concerned about our ability to reason over the data as well as to validate the underlying documents. However, if we do choose to use technologies such as OWL to support such reasoning, I do not believe that we can expect to reason over fully federated data. Surely we would expect to resolve the data (through LSIDs, PURLs, or whatever else) and then process them locally? We should certainly consider whether this is an issue, but we should keep it separate from the main issue identified by the TAG.
I fail to understand the distinction between class and instance. As far as I understand, it is arbitrary and depends on the purpose. The class "SDD.Character" is expressed in an XML instance document (a W3C schema document, using instances of classes expressed in the W3C schema for schemas). So is the class "flower color", which is an instance (or "data" in Donald's sense) expressed in an SDD instance document.
However, a taxon description makes use of "flower color" in the sense of a class definition, i.e. all descriptions using this term with a specific value are instances of the class "flower color".
"Flower color" can be generalized to "color of flower-like structure". This often makes a lot of sense, e.g. in Compositae (sunflower, etc.) many people will give an answer about the inflorescence rather than about a single flower, and even botany students get confused about the cyathium of Euphorbia. So we do want to make use of reasoning engines when processing taxon identification queries.
Exactly the same generalization relationships hold for taxa.
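To make the point concrete, here is a minimal sketch in Python of the kind of inference I mean. The term names, the taxa and the BROADER table are purely illustrative assumptions, not taken from SDD or from any real reasoner. Each term is a data object with a link to a broader term, yet the descriptions use the same terms as classes.

```python
# Hypothetical term hierarchy; each entry maps a term to its broader term.
# The names are illustrative only and are not taken from SDD.
BROADER = {
    "flower color": "color of flower-like structure",
    "inflorescence color": "color of flower-like structure",
    "color of flower-like structure": "color",
}


def generalizations(term):
    """Return the term followed by all of its broader terms, nearest first."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


def matches(query_term, described_term):
    """True if a description using described_term answers a query on query_term."""
    return query_term in generalizations(described_term)


# Here the terms act as classes: each (taxon, term, value) triple is an instance.
# "Taxon A" and "Taxon B" are placeholders, not real descriptions.
descriptions = [
    ("Taxon A", "inflorescence color", "yellow"),
    ("Taxon B", "flower color", "pink"),
]


def find_taxa(query_term, value):
    """Return taxa whose description matches the query term or a narrower term."""
    return [
        taxon
        for taxon, term, described_value in descriptions
        if described_value == value and matches(query_term, term)
    ]
```

A query on the generalized term "color of flower-like structure" finds Taxon A even though its description only mentions the inflorescence, whereas a query on "flower color" does not. This only works if the terms and their relations are available to whatever does the reasoning.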
What I am trying to bring out is that those charged with developing the GBIF software draw this distinction from their own perspective, which is a good thing. But by requiring LSIDs for what is currently perceived, from the software development perspective, as an instance, we may be heading down a path that prevents us from ever changing that perspective. If my information is correct, this would prevent exactly the use of standard reasoners to answer questions such as those I posed in my talk prepared for TAG-1 (which I cannot post to the email list, since only 200 kB are allowed).
I also suspect that LSIDs may be a really good way for us to handle many of our controlled vocabularies. Obviously those vocabularies which make up the definition of classes and their properties may need URL access, but in many other contexts (including, I would have thought, SDD) it may be more sensible to treat the vocabulary terms as data objects. This will allow us to extend them with all kinds of metadata.
I would say that the major reason the GUID meetings avoided adopting PURLs was simply that they give us no clean separation between the identifier and the owner and location of the document. LSIDs (provided we sort out appropriate best practices for how they are constructed) may, among other things, give us an intermediate layer we can conveniently manage to handle this.
Can you explain this? I believe a PURL does exactly this. If I have purl.org/xyz, it is an ID, but the document may be anywhere and may in fact move.
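As a rough sketch of how I understand the indirection (the resolver table and both server locations below are made-up assumptions, not how purl.org is actually implemented):

```python
# Hypothetical resolver table: stable identifier -> current document location.
# Both locations are invented for illustration.
resolver = {"purl.org/xyz": "http://server-a.example.org/docs/xyz.xml"}


def resolve(identifier):
    """Return the current location registered for a stable identifier."""
    return resolver[identifier]


def move_document(identifier, new_location):
    """Record that the document moved; the identifier itself does not change."""
    resolver[identifier] = new_location
```

Calling move_document("purl.org/xyz", "http://server-b.example.org/xyz.xml") changes what resolve("purl.org/xyz") returns, but anyone citing purl.org/xyz keeps using the same ID.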
Thanks!
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn@bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203