Hi All,
I'd like to re-ignite a debate we had over the summer in the light of what was discussed at Bratislava and things I am thinking about now.
The Species Profile Model was originally structured along the lines of the attached UML diagram, tagging.png.
We didn't define a class "Category" but just left it as being any valid URI in the RDF style of doing things.
Markus and several others pointed out that this would not work with TAPIR very well as all the InfoItems will look the same. Something like this:
<SpeciesProfileModel>
  <hasInformation>
    <InfoItem>
      <category rdf:resource="some.resource.org/123" />
      <....>
    </InfoItem>
  </hasInformation>
  <hasInformation>
    <InfoItem>
      <category rdf:resource="some.resource.org/124" />
      <....>
    </InfoItem>
  </hasInformation>
  <hasInformation>
    <InfoItem>
      <category rdf:resource="some.resource.org/125" />
      <....>
    </InfoItem>
  </hasInformation>
</SpeciesProfileModel>
The xpaths to the data represented by <...> would all be the same.
They suggested something like the next diagram:
In it, InfoItem is subclassed. This allows the RDF to be serialized like this:
<SpeciesProfileModel>
  <hasInformation>
    <Ecology>
      <....>
    </Ecology>
  </hasInformation>
  <hasInformation>
    <Behaviour>
      <....>
    </Behaviour>
  </hasInformation>
  <hasInformation>
    <BehaviouralEcology>
      <....>
    </BehaviouralEcology>
  </hasInformation>
</SpeciesProfileModel>
The xpaths to the <...> are all different and the model becomes TAPIR friendly.
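As a rough illustration of the path argument, here is a small Python sketch that collects the element paths leading to the content of each InfoItem in both serializations. The `hasContent` element stands in for the elided `<....>` and is purely my assumption, as are the sample text values; this only mimics how a path-based protocol would see the documents, it is not TAPIR itself.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Tagging style: every item is an InfoItem, distinguished only by its category.
# <hasContent> is a hypothetical stand-in for the elided <....> content.
TAGGED = f"""
<SpeciesProfileModel xmlns:rdf="{RDF}">
  <hasInformation>
    <InfoItem>
      <category rdf:resource="some.resource.org/123"/>
      <hasContent>text about ecology</hasContent>
    </InfoItem>
  </hasInformation>
  <hasInformation>
    <InfoItem>
      <category rdf:resource="some.resource.org/124"/>
      <hasContent>text about behaviour</hasContent>
    </InfoItem>
  </hasInformation>
</SpeciesProfileModel>
"""

# Subclassing style: the element name itself carries the kind of information.
SUBCLASSED = """
<SpeciesProfileModel>
  <hasInformation>
    <Ecology><hasContent>text about ecology</hasContent></Ecology>
  </hasInformation>
  <hasInformation>
    <Behaviour><hasContent>text about behaviour</hasContent></Behaviour>
  </hasInformation>
</SpeciesProfileModel>
"""


def content_paths(xml_text, leaf="hasContent"):
    """Return the set of distinct element paths that end at a `leaf` element."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, trail):
        trail = trail + [node.tag]
        if node.tag == leaf:
            paths.add("/" + "/".join(trail))
        for child in node:
            walk(child, trail)

    walk(root, [])
    return paths


tagged_paths = content_paths(TAGGED)
subclassed_paths = content_paths(SUBCLASSED)
print(sorted(tagged_paths))
print(sorted(subclassed_paths))
```

The tagged document yields a single path shared by every item, while the subclassed one yields one path per kind of information, which is the distinction a path-based mapping depends on.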
At the time I wasn't too bothered either way because I saw this as an artifact of serialization. The same thing could be written:
<SpeciesProfileModel>
  <hasInformation>
    <InfoItem>
      <rdf:type rdf:resource="some.resource.org/Ecology" />
      <....>
    </InfoItem>
  </hasInformation>
  <hasInformation>
    <InfoItem>
      <rdf:type rdf:resource="some.resource.org/Behaviour" />
      <....>
    </InfoItem>
  </hasInformation>
  <hasInformation>
    <InfoItem>
      <rdf:type rdf:resource="some.resource.org/BehaviouralEcology" />
      <....>
    </InfoItem>
  </hasInformation>
</SpeciesProfileModel>
With more or less the same meaning, and looking just like the tagging example.
This was a mistake on my part: it is of course not the same meaning, just a similar serialization. An rdf:type really needs to point to a class; it can't point to any old vocabulary term, as it could if it were treated like a tag.
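To make the difference concrete, here is a minimal Python sketch of the RDFS range rule applied to plain triples. It relies on the RDFS axiom that the range of rdf:type is rdfs:Class. The item name "item1" is made up, and I am assuming the category property has no declared range; this is a toy rule application, not a real reasoner.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"
RDFS_CLASS = "rdfs:Class"

# RDFS axiomatic triple: the range of rdf:type is rdfs:Class.
AXIOMS = {(RDF_TYPE, RDFS_RANGE, RDFS_CLASS)}


def infer_range_types(triples):
    """Apply the RDFS range rule once: (p range C) and (s p o) => (o type C)."""
    all_triples = set(triples) | AXIOMS
    ranges = {(s, o) for s, p, o in all_triples if p == RDFS_RANGE}
    inferred = set()
    for s, p, o in all_triples:
        for prop, cls in ranges:
            if p == prop:
                inferred.add((o, RDF_TYPE, cls))
    return inferred


# "item1" is a hypothetical InfoItem; "category" is assumed to have no range.
typed = {("item1", RDF_TYPE, "some.resource.org/Ecology")}
tagged = {("item1", "category", "some.resource.org/123")}

print(infer_range_types(typed))   # the object of rdf:type becomes a class
print(infer_range_types(tagged))  # nothing is implied about the tag value
```

Using rdf:type commits the thing pointed at to being a class, whereas the category tag leaves the term free to be a concept in any vocabulary.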
I have just added the category property back into InfoItem for discussion purposes.
I am concerned about going down the subclassing route for a few reasons.
1) Building a class hierarchy is difficult. The discussions we had in Bratislava highlighted this: the examples put together in the current vocabulary sparked a debate as to how things should be organized. There is also the diamond problem in multiple inheritance: if contradictory semantics arrive via different inheritance routes to a class, which has priority? And if we don't have multiple inheritance, how do we have InfoItems that are about both Ecology and Behaviour, for example, or any other two subjects such as Dispersal and Asexual reproduction?
2) If a class hierarchy is required for analysis/inference it can be superimposed on a tag-based transfer protocol using OWL necessary and sufficient conditions. In fact this is arguably a better way of approaching the situation than building a class hierarchy a priori.
3) The motivation for going down the subclassing route is largely so it will work with the transport protocol - which sets alarm bells ringing for me.
4) It precludes us from using something like SKOS to build our categories.
"SKOS or Simple Knowledge Organisation System is a family of formal languages designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is built upon RDF and RDFS, and its main objective is to enable easy publication of controlled structured vocabularies for the Semantic Web. SKOS is currently developed within the W3C framework." (http://en.wikipedia.org/wiki/SKOS)
I would really like to be able to say to "domain experts" that they "just" need to build a SKOS vocabulary for the terms used in their domain and then use the URIs of these terms for tagging information that is passed between providers, rather than specifying that they need to construct a more formal ontology of classes. I appreciate that one could argue that SKOS terms are like classes, but they form part of a structure that is specifically designed for having the debates that TDWG needs to have around the vocabulary part of the standards (rather than the exchange parts).
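The tagging approach, with a grouping superimposed afterwards, can be sketched roughly as follows in plain Python. The concept URIs under http://example.org/terms/, the broader links between them and the three InfoItems are all invented for illustration. The `members` function only imitates what an OWL defined class ("any InfoItem with a category at or below X") would give you; it is not OWL reasoning.

```python
T = "http://example.org/terms/"  # hypothetical vocabulary namespace

# skos:broader links, stored as concept -> set of broader concepts.
# SKOS allows a concept to have more than one broader concept.
BROADER = {
    T + "BehaviouralEcology": {T + "Ecology", T + "Behaviour"},
    T + "Dispersal": {T + "Ecology"},
}

# InfoItems tagged with one or more concept URIs (made-up data).
INFO_ITEMS = {
    "item1": {T + "Dispersal", T + "AsexualReproduction"},
    "item2": {T + "BehaviouralEcology"},
    "item3": {T + "Behaviour"},
}


def broader_transitive(concept, broader=BROADER):
    """Return the concept plus every concept reachable via broader links."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(broader.get(current, ()))
    return seen


def members(concept, items=INFO_ITEMS):
    """InfoItems tagged with `concept` or with anything narrower than it."""
    return {
        name
        for name, tags in items.items()
        if any(concept in broader_transitive(tag) for tag in tags)
    }


print(sorted(members(T + "Ecology")))
print(sorted(members(T + "Behaviour")))
```

Here item1 carries two unrelated tags and item2 falls under both Ecology and Behaviour, without anyone having had to settle a class hierarchy before the data was exchanged.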
I would be grateful for people's thoughts and criticisms.
All the best,
Roger