[tdwg-content] domains of object properties. was Re: DwC for the semantic web
Steven J. Baskauf
steve.baskauf at vanderbilt.edu
Fri Apr 29 05:42:16 CEST 2011
Ooo! I knew Bob would have something to say about this! Welcome back
Bob, by the way.
I should probably let Cam comment on this, but he's going to be in the
field without Internet for about a week, so I'll try to respond. We
were very conscious of the connection between domain/range declarations,
how they might cause unintended rdf:type assignments (see
http://code.google.com/p/darwin-sw/wiki/ClassesAndTypes for more on
this), and of the history of this issue in the discussion that led up
to the Darwin Core standard. In fact, the issue of whether to have any
range and domain declarations was probably the first thing that Cam and
I discussed when we started the project and we initially didn't agree on
it. In the end, we decided that DSW is different from DwC in an
important way. DwC is designed to be a general purpose vocabulary where
it is OK to use a term like dwc:genus as a property for pretty much
anything that you want. DSW is NOT a general-purpose ontology. It is
an ontology for people who see the biodiversity informatics realm
according to the (maybe) consensus that came out in the discussion last
Oct/Nov which we have attempted to embody in the DSW ontology. So
people who see that realm in a different way SHOULD NOT use DSW. They
should create their own ontology and describe the biodiversity
informatics world the way that they see it. I see an RDF
representation of metadata as fundamentally different from a delimited
text or generic xml file. In those latter cases, there is generally an
agreement between the sender and receiver about what the content of the
file "means" - perhaps articulated formally in a schema or just
informally in an email. In the case of RDF, the sender does not
necessarily know who the receiver will be, and the receiver does not
know what the content of the RDF means unless the sender uses classes
and properties that are widely understood. That is why I am somewhat
skeptical about just marking metadata up as RDF any old way and throwing
it out into the cloud. What good does that do if "the cloud" has no
idea what the metadata means and can't interpret it? DSW is an attempt
to create RDF that is consistent with (what we believe to be) the way
the TDWG community "understands" the DwC classes and their relationships
(at least as far as we could infer it from the tdwg-content discussion
from September to now).
I realize that this closes the world a lot. But a major part of what
Cam and I were trying to do was to create a situation where aggregators
who want to collect metadata on resources identified by GUIDs can feel
reasonably confident that particular rdf:types of resources from
diverse sources actually "mean" the same thing to all of the providers.
The fact that using the wrong object properties to describe a resource
might be "naughty" will (hopefully) get people to think carefully about
the (rdf:)type of the thing they are trying to describe.
Most of the object properties that Cam and I defined in DSW were
intended specifically to describe the relationship between two classes
in the ontology and nothing else. For example dsw:idBasedOn is supposed
to describe the Token (evidence) on which an Identification is based.
We don't intend for that term to be used creatively by people outside
the biodiversity informatics community. We intend for it to be used by
people who want to clearly state that a particular object resource
served as evidence for the subject Identification. Hence, the domain is
dwc:Identification and the range is dsw:Token . Any other use is abuse,
and if the user applies it as a property of a cat, then that user is
stating that the cat is a dwc:Identification. That is silly and anyone
that careless should not use DSW. On the other hand, saying that
dsw:idBasedOn has the range dsw:Token allows me to indicate that the basis
of an Identification (whose evidence was described by somebody else) is
rdf:type dsw:Token, even if I did not write the RDF description
of that evidence myself. Likewise, applying the term dsw:locatedAt
(which has the range dcterms:Location) to a place described by somebody
else allows me to assert that the object is a dcterms:Location even if
the person who created the RDF description of the place chose not to
explicitly rdf:type the place as a dcterms:Location.
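To make those inferences concrete, here is a minimal sketch in Python of the two RDFS entailment rules involved (rdfs2 for domains, rdfs3 for ranges). It is only an illustration, not part of DSW or of any RDF library: triples are plain tuples, the ex: resources are made up, and the only schema facts assumed are the domain and range of dsw:idBasedOn stated above.

```python
# Minimal sketch of RDFS domain/range entailment (rules rdfs2 and rdfs3).
# Triples are plain (subject, predicate, object) tuples of prefixed names.
# The ex: resources are hypothetical; the domain and range of dsw:idBasedOn
# are the ones stated in this message.

RDF_TYPE = "rdf:type"

SCHEMA = {
    ("dsw:idBasedOn", "rdfs:domain", "dwc:Identification"),
    ("dsw:idBasedOn", "rdfs:range", "dsw:Token"),
}


def infer_types(data, schema=SCHEMA):
    """Return the rdf:type triples entailed by domain and range declarations."""
    domains = {(p, c) for (p, k, c) in schema if k == "rdfs:domain"}
    ranges = {(p, c) for (p, k, c) in schema if k == "rdfs:range"}
    inferred = set()
    for s, p, o in data:
        for prop, cls in domains:
            if p == prop:
                # rdfs2: the subject is typed with the domain class
                inferred.add((s, RDF_TYPE, cls))
        for prop, cls in ranges:
            if p == prop:
                # rdfs3: the object is typed with the range class
                inferred.add((o, RDF_TYPE, cls))
    return inferred


# Intended use: somebody else described ex:photo1 and never typed it.
good = {("ex:id1", "dsw:idBasedOn", "ex:photo1")}
# Careless use: the property is applied to a cat.
bad = {("ex:cat", "dsw:idBasedOn", "ex:photo1")}
# A property with no domain or range declared in SCHEMA entails no types.
untyped = {("ex:thing", "dwc:genus", "Quercus")}

if __name__ == "__main__":
    for triple in sorted(infer_types(good | bad | untyped)):
        print(triple)
```

In this sketch the careless triple makes ex:cat a dwc:Identification, the intended triple types ex:photo1 as a dsw:Token without its describer having said so, and the dwc:genus triple yields nothing because no domain or range was declared for it here.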
In contrast, property terms from the DwC standard (which we feel are in
most cases appropriate for use as data rather than object properties) do
not necessarily have clear domains and ranges. Some do (e.g.
dwc:identifiedBy clearly has the domain dwc:Identification) but others
don't. Does dwc:genus have the domain dwc:Taxon or tn:TaxonName? Can
it be used as a property of a dwc:Identification? I don't know. Is
there a single class that is the domain of dwc:catalogNumber? Probably
not. Is there ANY existing class that is the domain of
dwc:establishmentMeans? Based on the many posts about that term, I
doubt it. The point is that I don't think that there is (yet at least)
a consensus on what the domain and range are for many DwC property
terms. So Cam and I did not feel comfortable assigning domains and
ranges for them nor dooming people to the sin of naughtiness if they
disagreed with us.
I hope this explains our reasoning somewhat. There is, of course, the
possibility that we just made a bad call and that none of our object
properties should have ranges and domains. Perhaps someone can make
counterarguments to what I've just said.
P.S. BTW it makes me feel so much better to know that I'm not the only
one to slip and say "rdfs:type"! Thanks for the careful read, Bob,
and we're looking forward to you discovering misstatements and
inaccuracies which are undoubtedly present in the DSW documentation.
Bob Morris wrote:
> I've been in Ecuador for two weeks with only a few hours of internet
> contact. Flying from Quito tonight. So I've only scanned the Design
> Document and may have done so too hastily. I'll have more time next
> week to look at it and the ontology. However, meanwhile, I've posted
> the following unsurprising comment there, and hope that anyone who
> responds here will also respond there.
> It's a little hard to understand why DSW object properties should
> have domains but data properties should not. Keep in mind that the
> formal semantics of rdfs:domain is that if rdfs:domain P C then any
> use a P x forces a to have rdfs:type C. This constrains the
> extensibility of P and probably of the use of other ontologies where P
> might be useful. Specifying domains tends to close the world somewhat,
> and I see no advantage to that... Indeed, this is more naughty than
> anti-naughty sensu Bob Morris.
> --Bob Morris
> On Wed, Apr 27, 2011 at 9:01 PM, Steve Baskauf
> <steve.baskauf at vanderbilt.edu> wrote:
>> Dear colleagues,
>> With the exciting development of Semantic Web technologies, many of us already need a way to consistently express DwC in RDF. In particular, we need it to meet the requirements for GUID resolution (as expressed in the TDWG applicability guide) and to be able to share and aggregate diverse kinds of biodiversity metadata in the Linked Open Data world.
>> After several months of development, stimulated by the tdwg-content discussions of last Fall, we would like to offer an
>> ontology for consideration, based on Darwin Core terms:
>> General site: http://code.google.com/p/darwin-sw/
>> Using namespace dsw = "http://purl.org/dsw/"
>> It is a candidate for general usage, but we do not claim it is _the_ solution, and greatly respect the efforts by others to develop similar ontologies, from which we have learned much. However, for DSW, we wanted to use existing DwC terms for classes and data properties whenever possible and only create new terms when there were no existing terms that would do the job. We did feel that there was a need for clarity in how resources should be typed (i.e. rdf:type property) and for object properties that expressed the relationships among classes unambiguously. Please see the Rationale, DesignPrinciples and ClassesAndTypes wiki pages at the above address.
>> In the ontology, we sought to embody relationships among classes based on our perception of the community consensus of what the classes represent and how they are related to each other, as expressed in posts to the tdwg-content list. Thus each class is documented carefully on the wiki with hyperlinked references to specific tdwg-content posts. We also sought
>> to clarify or resolve issues that were raised in the list discussion, most notably the relationship among Occurrences and the evidence that documents them (i.e. tokens), and the role of "individuals". See wiki pages for the Token, Occurrence, and IndividualOrganism classes.
>> The ontology is now in use at http://bioimages.vanderbilt.edu/, and we intend to use it more widely. We would value your opinions on the fitness of this ontology as a general solution for consistently expressing DwC concepts in RDF.
>> Steve Baskauf and Cam Webb
>> steve.baskauf at Vanderbilt.Edu
>> cwebb at oeb.harvard.edu
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707