[tdwg-tag] Re: TDM Ontology

Markus Döring m.doering at BGBM.org
Fri May 4 13:46:51 CEST 2007


Hello,
some small corrections below regarding my understanding
--
Markus



On 04.05.2007, at 12:58, Roger Hyam wrote:

> Hi Markus,
>
> I am replying to this and cc'ing the TAG list because I really  
> think we should be having the discussion there. I am sure there are  
> other people who might like to be involved from a technical  
> standpoint. I hope they can read this message thread backwards to catch up.
>
> If I can summarize:
>
> We are talking about the data models that were dreamt up at the  
> SpeciesDataModel workshop
>
> http://rs.tdwg.org/ontology/voc/TaxonDataModel
> http://rs.tdwg.org/ontology/voc/TDMTerm
>
> The choice is whether to have an inherited hierarchy of classes of  
> object to represent information items or to have a single  
> information item and 'tag' it with categories (instances).
The inherited hierarchy of classes is not needed for semantics in  
either approach. We can derive *all* classes directly from the base  
InfoItem class, just as you are planning to do with the controlled  
vocabulary terms, deriving them from BaseTerm. This kind of  
inheritance is more technical by nature, simply passing on all  
properties of InfoItem. The semantic hierarchy  
InfoItem -> BehaviorInfoItem -> EvolutionaryBehaviorInfoItem can be  
modelled in both approaches if we want. But in neither case do we have to!
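
A minimal sketch of this point, in Python rather than OWL. The class names and the two superclass tables are hypothetical; they only illustrate that a client asking "is this an InfoItem?" gets the same answer whether the classes hang flat under InfoItem or form a deeper semantic hierarchy.

```python
# Hypothetical class names and superclass tables, for illustration only.
FLAT = {
    "BehaviourInfoItem": "InfoItem",
    "EvolutionInfoItem": "InfoItem",
    "EvolutionaryBehaviourInfoItem": "InfoItem",
}
HIERARCHICAL = {
    "BehaviourInfoItem": "InfoItem",
    "EvolutionInfoItem": "InfoItem",
    "EvolutionaryBehaviourInfoItem": "BehaviourInfoItem",
}


def ancestors(cls, superclass_of):
    """Walk the superclass table upwards and return the chain of ancestors."""
    chain = []
    while cls in superclass_of:
        cls = superclass_of[cls]
        chain.append(cls)
    return chain


def is_info_item(cls, superclass_of):
    """True if cls is InfoItem itself or inherits from it, directly or not."""
    return cls == "InfoItem" or "InfoItem" in ancestors(cls, superclass_of)


for table in (FLAT, HIERARCHICAL):
    print(ancestors("EvolutionaryBehaviourInfoItem", table),
          is_info_item("EvolutionaryBehaviourInfoItem", table))
```

In the flat table the ancestor chain has one step; in the hierarchical one it has two. Either way the properties of InfoItem are passed on.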


> Having info items as different classes means that they would  
> possibly be clearer in a straight serialization.
>
> <tdm:TaxonDataModel>
> 	<tdm:aboutTaxon>.....</tdm:aboutTaxon>
> 	<tdm:hasInformation>
> 		<tdmt:Behaviour>
> 			<tdmt:hasContent>Some stuff about behaviour</tdmt:hasContent>
> 		</tdmt:Behaviour>
> 	</tdm:hasInformation>
> 	<tdm:hasInformation>
> 		<tdmt:Evolution>
> 			<tdmt:hasContent>Some stuff about evolution</tdmt:hasContent>
> 		</tdmt:Evolution>
> 	</tdm:hasInformation>
> 	<tdm:hasInformation>
> 		<tdmt:BehaviouralEvolution>
> 			<tdmt:hasContent>Some stuff about evolution of behaviour</tdmt:hasContent>
> 		</tdmt:BehaviouralEvolution>
> 	</tdm:hasInformation>
> </tdm:TaxonDataModel>
>
> But taking the tagging approach:
>
> <tdm:TaxonDataModel>
> 	<tdm:aboutTaxon>.....</tdm:aboutTaxon>
> 	<tdm:hasInformation>
> 		<tdm:InfoItem>
> 			<tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Behaviour"/>
> 			<tdmt:hasContent>Some stuff about behaviour</tdmt:hasContent>
> 		</tdm:InfoItem>
> 	</tdm:hasInformation>
> 	<tdm:hasInformation>
> 		<tdm:InfoItem>
> 			<tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Evolution"/>
> 			<tdmt:hasContent>Some stuff about evolution</tdmt:hasContent>
> 		</tdm:InfoItem>
> 	</tdm:hasInformation>
> 	<tdm:hasInformation>
> 		<tdm:InfoItem>
> 			<tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Behaviour"/>
> 			<tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Evolution"/>
> 			<tdmt:hasContent>Some stuff about evolution of behaviour</tdmt:hasContent>
> 		</tdm:InfoItem>
> 	</tdm:hasInformation>
> </tdm:TaxonDataModel>
>
> Renato raised questions about serving that tagged version with  
> TAPIR by which I think he meant TAPIRLink as it would not be  
> possible to do the above example as a flat schema. This is the same  
> problem as serving multiple identifications for a specimen I guess  
> - is this right?
Not really. I think the problem has to do with creating an XML schema  
to be used for a TAPIR model that has *different* node paths to be  
mapped to. Using the generic InfoItem "metamodel" you always end up  
with this path that you map to:

/tdm:TaxonDataModel/tdm:hasInformation/tdm:InfoItem/tdmt:hasContent


instead of having different ones when using inheritance:
/tdm:TaxonDataModel/tdm:hasInformation/tdmt:Behaviour/tdmt:hasContent
/tdm:TaxonDataModel/tdm:hasInformation/tdmt:Evolution/tdmt:hasContent
/tdm:TaxonDataModel/tdm:hasInformation/tdmt:BehaviouralEvolution/tdmt:hasContent
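
The difference can be sketched in Python with the standard xml.etree module. The namespace URIs bound to tdm: and tdmt: are assumptions (the thread only shows the prefixes), and the two documents are cut-down copies of Roger's examples. The helper collects every distinct element path that ends in tdmt:hasContent.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; only the prefixes appear in the thread.
NS = {
    "tdm": "http://rs.tdwg.org/ontology/voc/TaxonDataModel#",
    "tdmt": "http://rs.tdwg.org/ontology/voc/TDMTerm#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
DECLS = " ".join(f'xmlns:{p}="{u}"' for p, u in NS.items())

CLASS_BASED = f"""
<tdm:TaxonDataModel {DECLS}>
  <tdm:hasInformation>
    <tdmt:Behaviour><tdmt:hasContent>behaviour</tdmt:hasContent></tdmt:Behaviour>
  </tdm:hasInformation>
  <tdm:hasInformation>
    <tdmt:Evolution><tdmt:hasContent>evolution</tdmt:hasContent></tdmt:Evolution>
  </tdm:hasInformation>
</tdm:TaxonDataModel>
"""

TAGGED = f"""
<tdm:TaxonDataModel {DECLS}>
  <tdm:hasInformation>
    <tdm:InfoItem>
      <tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Behaviour"/>
      <tdmt:hasContent>behaviour</tdmt:hasContent>
    </tdm:InfoItem>
  </tdm:hasInformation>
  <tdm:hasInformation>
    <tdm:InfoItem>
      <tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Evolution"/>
      <tdmt:hasContent>evolution</tdmt:hasContent>
    </tdm:InfoItem>
  </tdm:hasInformation>
</tdm:TaxonDataModel>
"""


def qname(tag):
    """Turn ElementTree's '{uri}local' form back into 'prefix:local'."""
    uri, local = tag[1:].split("}")
    prefix = next(p for p, u in NS.items() if u == uri)
    return f"{prefix}:{local}"


def content_paths(xml_text):
    """Return the sorted, distinct element paths ending in tdmt:hasContent."""
    paths = set()

    def walk(elem, parent_path):
        path = f"{parent_path}/{qname(elem.tag)}"
        if qname(elem.tag) == "tdmt:hasContent":
            paths.add(path)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text.strip()), "")
    return paths and sorted(paths) or []


print(content_paths(TAGGED))       # one path, whatever the category
print(content_paths(CLASS_BASED))  # one path per class
```

With the tagged documents a mapping has a single node path to target, and the category has to be read from a sibling element; with one class per category each kind of information gets its own path.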


> This reminds me of a point I think Markus raised at the beginning.  
> Why not have InfoItem as the top level element and move the taxon  
> into it?
>
> <InfoItem>
> 	<aboutTaxon>...</aboutTaxon>
> 	<category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Evolution"/>
> 	<hasContent>Some stuff about evolution</hasContent>
> </InfoItem>
>
> Info item is then like a DwC record and the category property is  
> like the BasisOfRecord. (TAPIRLink couldn't do multiple category  
> stuff).
>
> The argument against this is that the metadata would have to be  
> repeated for multiple InfoItems. Most requests would be for  
> multiple InfoItems about the same species - I guess but I really  
> need clearer examples as to what this will be applied to. Who is  
> going to implement this in the near future? Perhaps they should  
> have a go and decide? Isn't Wouter doing something on it? I don't  
> have the time just now to try out some examples and I think that is  
> what is needed.
>
> What does everyone else think?
>
> All the best,
>
> Roger
>
>
> On 4 May 2007, at 10:56, Markus Döring wrote:
>
>> On 03.05.2007, at 11:18, Roger Hyam wrote:
>>
>>> Hi Markus,
>>>
>>> I have downsized the cc list for this discussion as I think it  
>>> may be just confusing to the less technically focussed or  
>>> otherwise involved people who would rather just hear the answer.
>>>
>>> I am not sure I totally follow you. Currently InfoItem.category  
>>> property has a range of DefinedTerm, which means that it  
>>> should contain an instance of DefinedTerm - i.e. the simplified  
>>> controlled vocabulary things we are using. It should perhaps have  
>>> a range of http://rs.tdwg.org/ontology/voc/TDMTerm#TDMTerm.
>> yes, so the definition of what the infoitem is about is a separate  
>> ontology, that's what I naively called "domain" ontology before.  
>> This can be a simple list of terms or a hierarchical list. I  
>> suspect you aim at the flat list to avoid inheritance for reasons  
>> given in the TAG wiki.
>>
>>
>>> Are you suggesting that we have different InfoItem child classes  
>>> one for each of the categories we are talking about (as listed  
>>> here http://rs.tdwg.org/ontology/voc/TDMTerm)?
>>>
>>> So we would have
>>>
>>>  <owl:Class rdf:ID="EvolutionInfoItem">
>>> 	<rdfs:subClassOf rdf:resource="http://rs.tdwg.org/ontology/Base#InfoItem"/>
>>> </owl:Class>
>>>
>>> and
>>>
>>>  <owl:Class rdf:ID="BehaviourInfoItem">
>>> 	<rdfs:subClassOf rdf:resource="http://rs.tdwg.org/ontology/Base#InfoItem"/>
>>> </owl:Class>
>>>
>>> etc.
>> yes. As Renato has mentioned in a related parallel discussion this  
>> also allows us to create TAPIR models, because an XML schema for  
>> these classes would have different element names and thus become  
>> mappable easily.
>>
>>
>>>
>>> And that instance data would look like this:
>>>
>>> <bii:BehaviourInfoItem>
>>> 	<ii:hasContent>Some stuff about behaviour</ii:hasContent>
>>> </bii:BehaviourInfoItem>
>> yes, that's what I was thinking of
>>
>>
>>> If you had evolutionary-behaviour data you might do this
>>>
>>> <bii:BehaviourInfoItem rdf:about="1233">
>>> 	<rdf:type rdf:resource="http://rs.tdwg.org/ontology/voc/EvolutionInfoItem#EvolutionInfoItem"/>
>>> 	<ii:hasContent>Some stuff about evolution of behaviour</ii:hasContent>
>>> </bii:BehaviourInfoItem>
>> I don't understand your intention here. If you want a more specific  
>> infoitem about evolutionary-behaviour why not define a new class?
>>
>> <eii:EvolutionInfoItem rdf:about="1233">
>> 	<ii:hasContent>Some stuff about evolution of behaviour</ii:hasContent>
>> </eii:EvolutionInfoItem>
>>
>>
>>> Here we say that the thing 1233 is an instance of both classes.
>>>
>>> The same instance data the way we came up with it at the meeting  
>>> would look like this:
>>>
>>> <tdm:InfoItem>
>>> 	<tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Evolution"/>
>>> 	<tdm:category rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Behaviour"/>
>>> 	<tdm:hasContent>Some stuff about evolution of behaviour</tdm:hasContent>
>>> </tdm:InfoItem>
>> alright, so one idea about having a category property is that it  
>> allows you to "tag" one fact (infoitem) with several categories.  
>> Is that a requirement you had in mind when designing TDM?
>>
>>
>>> The attraction of doing it this way to me (and I think Donald  
>>> suggested it) was that it is easy to write a client that will  
>>> digest InfoItems without knowing what they are. If the client  
>>> hadn't heard of Behaviour it could do nothing with a class-based  
>>> example unless it was capable of exploring the class hierarchy  
>>> and finding something it did know about and even then there may  
>>> have been restrictions on the properties that it didn't  
>>> understand. Effectively every client would need to know OWL.
>> well, not really. If all InfoItem instances are bundled through  
>> the TDMClass, you know all of the instances in there are  
>> InfoItems. And I can't see a difference for an ignorant application  
>> in not understanding the class name or not understanding the  
>> category class. For OWL aware applications on the other hand this  
>> gives extra knowledge which is not as easy to get if you have to  
>> understand the categories "domain ontology" / controlled  
>> vocabulary (Because you would need to know OWL AND know how to  
>> interpret the non-OWL category property)
>>
>>> If an application (particular thematic network) really did want a  
>>> BehaviourInfoItem class it could define one itself.
>>>
>>> <owl:Class rdf:ID="BehaviourInfoItem">
>>>   <rdfs:subClassOf rdf:resource="http://rs.tdwg.org/ontology/Base#InfoItem"/>
>>>   <owl:equivalentClass>
>>>     <owl:Restriction>
>>>       <owl:onProperty rdf:resource="tdm:category" />
>>>       <owl:hasValue rdf:resource="http://rs.tdwg.org/ontology/voc/TDMTerm#Behaviour" />
>>>     </owl:Restriction>
>>>   </owl:equivalentClass>
>>> </owl:Class>
>>>
>>> Which I believe would give the same inferences that would be  
>>> found by going with subclasses (though I am no expert).
>>>
>>> The important thing is that we keep the instance data as simple  
>>> and stable as possible and impose meaning later.
>>>
>>> If we were working in a pure semantic web world I would be more  
>>> inclined to go down the class based route but we have to also  
>>> deal with instance data as if it were plain XML documents that we  
>>> can use through TAPIR, validate with XML Schema and transform  
>>> with XSLT.
>> exactly this will be a problem if everything is an InfoItem...
>>
>>
>>> We could always change the definitions of the tdm:hasValue  
>>> property (and the others) so that they inherit from a high level  
>>> property. This kind of change is good because it doesn't affect  
>>> the instance data.
>>>
>>> Have I understood your points correctly or have I just gone off  
>>> on a circle explaining something that is completely off track?
>> I think we are on the same track.
>> Thanks for the insight, Roger!
>>
>> Markus
>>
>>
>>
>>>
>>> All the best,
>>>
>>> Roger
>>>
>>>
>>> On 2 May 2007, at 16:21, Markus Döring wrote:
>>>
>>>> Roger,
>>>> thanks for this. The wiki guide really is good advice and we  
>>>> should probably not use inheritance to model the domain  
>>>> ontology. We might use it carefully for more "technical"  
>>>> decisions like shared "global" properties, e.g. in the case of  
>>>> the Base#DefinedTerm class you created to derive all terms used  
>>>> for a controlled vocabulary from.
>>>>
>>>> But I still believe there is a difference from the Cat example  
>>>> in the wiki and the InfoItem class. The InfoItem class doesn't  
>>>> use concrete properties, like Cat::hasMarkings in your example,  
>>>> but rather uses a very flexible, generic property Info::hasValue  
>>>> or Info::hasContent. And exactly that abstraction makes me feel  
>>>> uncomfortable. We have to use another property "Info::category"  
>>>> to give semantics to the other value/content property. I doubt  
>>>> that any reasoner understands that (even if we don't make use of them).
>>>>
>>>> The alternatives in my previous message don't have to use much  
>>>> of inheritance in any case.
>>>> Applying A (using a common InfoItem Base Class) leaves us with  
>>>> the same situation that we have now. Instead of deriving terms  
>>>> in the domain ontology from Base#DefinedTerm we derive them from  
>>>> Base#InfoItem. And voila, we don't need the category property  
>>>> anymore.
>>>>
>>>> Applying "pattern B", i.e. deriving all properties from a base  
>>>> property, doesn't mean we have to use inheritance to model the  
>>>> domain. We can still derive all properties from a basic hasFact  
>>>> property for example. In this case (which I still feel is the  
>>>> most natural way of doing this) hasSize and hasDescription exist  
>>>> in parallel, but you would at least know they are a property of  
>>>> a taxon and that they have a value (either free text or a term  
>>>> from a list, using datatype or object property respectively). We  
>>>> would use inheritance mainly as a "technical" means and not to  
>>>> model a hierarchy of properties for a taxon.
>>>>
>>>> --
>>>> Markus
>>>>
>>>>
>>>>
>>>> On 24.04.2007, at 18:54, Roger Hyam wrote:
>>>>
>>>>>
>>>>> Hi Markus,
>>>>>
>>>>> I am glad you like it. Any resemblance to the Fact stuff in  
>>>>> ABCD is purely accidental ;).
>>>>>
>>>>>> (1) The first simple question is why we need a TDM class at  
>>>>>> all. Wouldn't it be sufficient to add an aboutTaxon property  
>>>>>> to the InfoItem class?
>>>>>
>>>>> This is the easy question. We need a container for sets of  
>>>>> InfoItems so that they can all be tagged with the same  
>>>>> metadata. The principal use case might be to get a set of info  
>>>>> about a particular taxon from a single provider and render it  
>>>>> as a web page.
>>>>>
>>>>>> (2) If I understand TDM correctly, the TDM InformationItem  
>>>>>> class is kept independent from the real "domain" ontology (the  
>>>>>> fact category), which is linked from an InfoItem instance via  
>>>>>> the category property. On the other hand the property category  
>>>>>> is not part of the OWL language, so no reasoner will  
>>>>>> understand that the hasValue/hasContent property of an  
>>>>>> InfoItem instance really belongs to the domain ontology class.
>>>>>
>>>>> This raises a general modeling point and makes me realize that  
>>>>> we haven't written it down anywhere. I spent some time talking  
>>>>> to Rob Gales about it last year. I have written a wiki page  
>>>>> that I hope explains it.
>>>>>
>>>>> http://wiki.tdwg.org/twiki/bin/view/TAG/SubclassOrNot
>>>>>
>>>>> Please take a look, edit and feedback - perhaps to the TAG list.
>>>>>
>>>>> Subclassing properties is probably out as we have to allow for  
>>>>> naive implementations wherever possible.
>>>>>
>>>>> This may be way too techie for most of the audience. If we did  
>>>>> change the way we modeled it, it may not have big  
>>>>> implications for the 'domain experts' unless we ask them to  
>>>>> produce a class hierarchy - which may slow them down.
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Roger
>>>>>
>>>>>
>>>>>
>>>>> On 23 Apr 2007, at 16:52, Markus Döring wrote:
>>>>>
>>>>>> Dear all,
>>>>>> first of all I'd like to thank you guys for coming up with a  
>>>>>> nice RDF ontology for TDM!
>>>>>>
>>>>>> When taking a first look at it I couldn't exactly understand  
>>>>>> some modelling decisions though, so I would be happy if  
>>>>>> someone could shed light on my questions below that are based  
>>>>>> on this ontology:  
>>>>>> http://rs.tdwg.org/ontology/voc/TaxonDataModel.rdf
>>>>>>
>>>>>> I am still new to OWL ontologies, so I hope the following  
>>>>>> questions do not sound totally stupid.
>>>>>>
>>>>>>
>>>>>> (1) The first simple question is why we need a TDM class at  
>>>>>> all. Wouldn't it be sufficient to add an aboutTaxon property  
>>>>>> to the InfoItem class?
>>>>>>
>>>>>>
>>>>>> (2) If I understand TDM correctly, the TDM InformationItem  
>>>>>> class is kept independent from the real "domain" ontology (the  
>>>>>> fact category), which is linked from an InfoItem instance via  
>>>>>> the category property. On the other hand the property category  
>>>>>> is not part of the OWL language, so no reasoner will  
>>>>>> understand that the hasValue/hasContent property of an  
>>>>>> InfoItem instance really belongs to the domain ontology class.
>>>>>>
>>>>>> I can see two alternatives to this, so I'm curious to know what  
>>>>>> you think about them. In case you have discussed them already,  
>>>>>> could someone explain to me why they were considered less  
>>>>>> appropriate?
>>>>>>
>>>>>>
>>>>>> Alternative A) - InfoItem Base Class
>>>>>> Derive all domain ontology classes from an InfoItem base class  
>>>>>> with properties (context, hasValue, ...)
>>>>>>
>>>>>>
>>>>>> Alternative B) - TCS Object Properties
>>>>>> Define the domain ontology mainly as object properties  
>>>>>> (similar to dublin core) that have a  
>>>>>> rdfs:domain=tcs:TaxonConcept and an rdfs:range that points to  
>>>>>> an InfoItem like base class that allows for context. We can  
>>>>>> define different InfoItem derived classes as ranges for some  
>>>>>> properties, allowing us to enforce free text (hasValue) or  
>>>>>> some specific controlled vocabulary (hasContent). The  
>>>>>> properties can also use inheritance to allow for broad and  
>>>>>> specialized descriptors. The TaxonConcept class already has  
>>>>>> some properties i.e. describedBy (for descriptive text) and  
>>>>>> circumscribedBy (specimen). These two properties could already  
>>>>>> serve as base properties to create specialised descriptors via  
>>>>>> rdfs:subPropertyOf.
>>>>>>
>>>>>>
>>>>>> Well, as always there are many ways of doing the same thing.
>>>>>> Best wishes
>>>>>> Markus
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Markus Döring
>>>>>>   Botanic Garden and Botanical Museum Berlin Dahlem,
>>>>>>   Dept. of Biodiversity Informatics
>>>>>>   Königin-Luise-Str. 6-8, D-14191 Berlin
>>>>>>   +49 (30) 83850-284
>>>>>>   m.doering at bgbm.org
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>