[tdwg-content] domains of object properties. was Re: DwC for the semantic web

Steven J. Baskauf steve.baskauf at vanderbilt.edu
Fri Apr 29 05:42:16 CEST 2011

Ooo!  I knew Bob would have something to say about this!  Welcome back 
Bob, by the way.

I should probably let Cam comment on this, but he's going to be in the 
field without Internet for about a week, so I'll try to respond.  We 
were very conscious of the connection between domain/range declarations, 
how they might cause unintended rdf:type assignments (see 
http://code.google.com/p/darwin-sw/wiki/ClassesAndTypes for more on 
this), and of the history of this issue in the discussion that led up 
to the Darwin Core standard.  In fact, the issue of whether to have any 
range and domain declarations was probably the first thing that Cam and 
I discussed when we started the project and we initially didn't agree on 
it.  In the end, we decided that DSW is different from DwC in an 
important way.  DwC is designed to be a general purpose vocabulary where 
it is OK to use a term like dwc:genus as a property for pretty much 
anything that you want.  DSW is NOT a general-purpose ontology.  It is 
an ontology for people who see the biodiversity informatics realm 
according to the (maybe) consensus that came out in the discussion last 
Oct/Nov which we have attempted to embody in the DSW ontology.  So 
people who see that realm in a different way SHOULD NOT use DSW.  They 
should create their own ontology and describe the biodiversity 
informatics world the way that they see it.  I see an RDF 
representation of metadata as fundamentally different from a delimited 
text or generic XML file.  In those latter cases, there is generally an 
agreement between the sender and receiver about what the content of the 
file "means" - perhaps articulated formally in a schema or just 
informally in an email.  In the case of RDF, the sender does not 
necessarily know who the receiver will be, and the receiver does not 
know what the content of the RDF means unless the sender uses classes 
and properties that are widely understood.  That is why I am somewhat 
skeptical about just marking metadata up as RDF any old way and throwing 
it out into the cloud.  What good does that do if "the cloud" has no 
idea what the metadata means and can't interpret it?  DSW is an attempt 
to create RDF that is consistent with (what we believe to be) the way 
the TDWG community "understands" the DwC classes and their relationships 
(at least as far as we could infer it from the tdwg-content discussion 
from September to now).

I realize that this closes the world a lot.  But a major part of what 
Cam and I were trying to do was to create a situation where aggregators 
who want to collect metadata on resources identified by GUIDs can feel 
reasonably confident that resources of a particular rdf:type from 
diverse sources actually "mean" the same thing to all of the providers.  
The fact that using the wrong object properties to describe a resource 
might be "naughty" will (hopefully) get people to think carefully about 
the (rdf:)type of the thing they are trying to describe. 

Most of the object properties that Cam and I defined in DSW were 
intended specifically to describe the relationship between two classes 
in the ontology and nothing else.  For example, dsw:idBasedOn is supposed 
to describe the Token (evidence) on which an Identification is based.  
We don't intend for that term to be used creatively by people outside 
the biodiversity informatics community.  We intend for it to be used by 
people who want to clearly state that a particular object resource 
served as evidence for the subject Identification.  Hence, the domain is 
dwc:Identification and the range is dsw:Token.  Any other use is abuse, 
and if the user applies it as a property of a cat, then that user is 
stating that the cat is a dwc:Identification.  That is silly and anyone 
that careless should not use DSW.  On the other hand, saying that 
dsw:idBasedOn has the range dsw:Token allows me to indicate that the 
basis of an Identification (evidence that was described by somebody 
else) has rdf:type dsw:Token, even if I did not write the RDF description 
of that evidence myself.  Likewise, applying the term dsw:locatedAt 
(which has the range dcterms:Location) to a place described by somebody 
else allows me to assert that the object is a dcterms:Location even if 
the person who created the RDF description of the place chose not to 
explicitly rdf:type the place as a dcterms:Location. 
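
The inference described above can be sketched in a few lines of Python.  
This is only an illustration under stated assumptions: triples are plain 
tuples rather than a real RDF store, the ex: resources are hypothetical, 
and only the domain and range declarations mentioned in this message are 
included (no domain is assumed for dsw:locatedAt).

```python
# Minimal sketch of RDFS domain/range entailment (rules rdfs2 and rdfs3).
# Triples are (subject, predicate, object) tuples; every ex: resource is
# hypothetical and exists only for illustration.

RDF_TYPE = "rdf:type"

# Only the declarations mentioned in the message are listed here.
DOMAINS = {"dsw:idBasedOn": "dwc:Identification"}
RANGES = {"dsw:idBasedOn": "dsw:Token", "dsw:locatedAt": "dcterms:Location"}


def infer_types(triples):
    """Return the rdf:type triples entailed by the domain/range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in DOMAINS:  # rdfs2: the subject gets the domain class
            inferred.add((s, RDF_TYPE, DOMAINS[p]))
        if p in RANGES:  # rdfs3: the object gets the range class
            inferred.add((o, RDF_TYPE, RANGES[p]))
    return inferred


data = [
    ("ex:identification1", "dsw:idBasedOn", "ex:specimen1"),
    ("ex:occurrence1", "dsw:locatedAt", "ex:place1"),
    ("ex:myCat", "dsw:idBasedOn", "ex:photo1"),  # the careless use
]

for triple in sorted(infer_types(data)):
    print(triple)
```

Note that ex:place1 is typed as a dcterms:Location without its describer 
having said so, and that ex:myCat is typed as a dwc:Identification simply 
because it was used as the subject of dsw:idBasedOn.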

In contrast, property terms from the DwC standard (which we feel are in 
most cases appropriate for use as data rather than object properties) do 
not necessarily have clear domains and ranges.  Some do (e.g. 
dwc:identifiedBy clearly has the domain dwc:Identification) but others 
don't.  Does dwc:genus have the domain dwc:Taxon or tn:TaxonName?  Can 
it be used as a property of a dwc:Identification?  I don't know.  Is 
there a single class that is the domain of dwc:catalogNumber?  Probably 
not.  Is there ANY existing class that is the domain of 
dwc:establishmentMeans?  Based on the many posts about that term, I 
doubt it.  The point is that I don't think that there is (yet at least) 
a consensus on what the domain and range are for many DwC property 
terms.  So Cam and I did not feel comfortable assigning domains and 
ranges for them nor dooming people to the sin of naughtiness if they 
disagreed with us.
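
To make the trade-off concrete, here is a small Python sketch comparing 
the two choices for dwc:genus.  The domain declaration shown is purely 
hypothetical (neither DwC nor DSW is assumed to declare it), and the ex: 
resources and the genus value are made up for illustration.

```python
# Sketch: what an rdfs:domain on dwc:genus would entail, versus leaving it
# undeclared. The "dwc:genus has domain dwc:Taxon" declaration is hypothetical.

RDF_TYPE = "rdf:type"


def subject_types(triples, domains):
    """Apply rule rdfs2: each subject gets the domain class of its property."""
    return {(s, RDF_TYPE, domains[p]) for s, p, o in triples if p in domains}


# Three providers attach dwc:genus to three different kinds of resource.
data = [
    ("ex:taxon1", "dwc:genus", "Quercus"),
    ("ex:name1", "dwc:genus", "Quercus"),
    ("ex:identification1", "dwc:genus", "Quercus"),
]

no_domain = subject_types(data, {})
hypothetical = subject_types(data, {"dwc:genus": "dwc:Taxon"})

print(len(no_domain))
for triple in sorted(hypothetical):
    print(triple)
```

With no domain declared, nothing is entailed about the three subjects.  
With the hypothetical declaration, all three, including the 
Identification, would be typed as dwc:Taxon whether or not their 
providers intended that.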

I hope this explains our reasoning somewhat.  There is, of course, the 
possibility that we just made a bad call and that none of our object 
properties should have ranges and domains.  Perhaps someone can make 
counterarguments to what I've just said.


P.S. BTW it makes me feel so much better to know that I'm not the only 
one to slip and say "rdfs:type"!  Thanks for the careful read, Bob, 
and we're looking forward to your discovering the misstatements and 
inaccuracies that are undoubtedly present in the DSW documentation.

Bob Morris wrote:
> I've been in Ecuador for two weeks with only a few hours of internet
> contact. Flying from Quito tonight.  So I've only scanned the Design
> Document and may have done so too hastily.  I'll have more time next
> week to look at it and the ontology.  However, meanwhile, I've posted
> the following unsurprising comment there, and hope that anyone who
> responds here will also respond there.
> -----
> It's a little hard to understand why DSW object properties should
> have domains but data properties should not. Keep in mind that the
> formal semantics of rdfs:domain is that if rdfs:domain P C then any
> use a P x forces a to have rdfs:type C. This constrains the
> extensibility of P and probably of the use of other ontologies where P
> might be useful. Specifying domains tends to close the world somewhat,
> and I see no advantage to that....Indeed, this is more naughty than
> anti-naughty sensu Bob Morris.
> -----
> --Bob Morris
> On Wed, Apr 27, 2011 at 9:01 PM, Steve Baskauf
> <steve.baskauf at vanderbilt.edu> wrote:
>> Dear colleagues,
>> With the exciting development of Semantic Web technologies, many of us already need a way to consistently express DwC in RDF.  In particular, we need it to meet the requirements for GUID resolution (as expressed in the TDWG applicability guide) and to be able to share and aggregate diverse kinds of biodiversity metadata in the Linked Open Data world.
>> After several months of development, stimulated by the tdwg-content discussions of last Fall, we would like to offer an ontology for consideration, based on Darwin Core terms:
>>   Darwin-SW:
>>     General site: http://code.google.com/p/darwin-sw/
>>     Using namespace dsw = "http://purl.org/dsw/"
>> It is a candidate for general usage, but we do not claim it is _the_ solution, and greatly respect the efforts by others to develop similar ontologies, from which we have learned much. However, for DSW, we wanted to use existing DwC terms for classes and data properties whenever possible and only create new terms when there were no existing terms that would do the job.  We did feel that there was a need for clarity in how resources should be typed (i.e. rdf:type property) and for object properties that expressed the relationships among classes unambiguously. Please see the Rationale, DesignPrinciples and ClassesAndTypes wiki pages at the above address.
>> In the ontology, we sought to embody relationships among classes based on our perception of the community consensus of what the classes represent and how they are related to each other, as expressed in posts to the tdwg-content list.  Thus each class is documented carefully on the wiki with hyperlinked references to specific tdwg-content posts. We also sought to clarify or resolve issues that were raised in the list discussion, most notably the relationship among Occurrences and the evidence that documents them (i.e. tokens), and the role of "individuals".  See wiki pages for the Token, Occurrence, and IndividualOrganism classes.
>> The ontology is now in use at http://bioimages.vanderbilt.edu/, and we intend to use it more widely.  We would value your opinions on the fitness of this ontology as a general solution for consistently expressing DwC concepts in RDF.
>> Sincerely,
>> Steve Baskauf and Cam Webb
>> steve.baskauf at Vanderbilt.Edu
>> cwebb at oeb.harvard.edu

Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
