Ooo! I knew Bob would have something to say about this! Welcome back Bob, by the way.
I should probably let Cam comment on this, but he's going to be in the field without Internet for about a week, so I'll try to respond. We were very conscious of the connection between domain/range declarations and the unintended rdf:type assignments they might cause (see http://code.google.com/p/darwin-sw/wiki/ClassesAndTypes for more on this), and of the history of this issue in the discussion that led up to the Darwin Core standard. In fact, the issue of whether to have any range and domain declarations was probably the first thing that Cam and I discussed when we started the project, and we initially didn't agree on it.
In the end, we decided that DSW is different from DwC in an important way. DwC is designed to be a general-purpose vocabulary where it is OK to use a term like dwc:genus as a property for pretty much anything that you want. DSW is NOT a general-purpose ontology. It is an ontology for people who see the biodiversity informatics realm according to the (maybe) consensus that came out in the discussion last Oct/Nov, which we have attempted to embody in the DSW ontology. So people who see that realm in a different way SHOULD NOT use DSW. They should create their own ontology and describe the biodiversity informatics world the way that they see it.
I see an RDF representation of metadata as fundamentally different from a delimited text or generic XML file. In those latter cases, there is generally an agreement between the sender and receiver about what the content of the file "means" - perhaps articulated formally in a schema or just informally in an email. In the case of RDF, the sender does not necessarily know who the receiver will be, and the receiver does not know what the content of the RDF means unless the sender uses classes and properties that are widely understood. That is why I am somewhat skeptical about just marking metadata up as RDF any old way and throwing it out into the cloud. What good does that do if "the cloud" has no idea what the metadata means and can't interpret it? DSW is an attempt to create RDF that is consistent with (what we believe to be) the way the TDWG community "understands" the DwC classes and their relationships (at least as far as we could infer it from the tdwg-content discussion from September to now).
I realize that this closes the world a lot. But a major part of what Cam and I were trying to do was to create a situation where aggregators who want to collect metadata on resources identified by GUIDs can feel reasonably confident that resources of a particular rdf:type from diverse sources actually "mean" the same thing to all of the providers. The fact that using the wrong object properties to describe a resource might be "naughty" will (hopefully) get people to think carefully about the (rdf:)type of the thing they are trying to describe.
Most of the object properties that Cam and I defined in DSW were intended specifically to describe the relationship between two classes in the ontology and nothing else. For example, dsw:idBasedOn is supposed to describe the Token (evidence) on which an Identification is based. We don't intend for that term to be used creatively by people outside the biodiversity informatics community. We intend for it to be used by people who want to clearly state that a particular object resource served as evidence for the subject Identification. Hence, the domain is dwc:Identification and the range is dsw:Token. Any other use is abuse, and if the user applies it as a property of a cat, then that user is stating that the cat is a dwc:Identification. That is silly, and anyone that careless should not use DSW. On the other hand, saying that dsw:idBasedOn has the range dsw:Token allows me to indicate that the basis of an Identification (whose evidence was described by somebody else) has rdf:type dsw:Token, even if I did not write the RDF description of that evidence myself. Likewise, applying the term dsw:locatedAt (which has the range dcterms:Location) to a place described by somebody else allows me to assert that the object is a dcterms:Location even if the person who created the RDF description of the place chose not to explicitly rdf:type the place as a dcterms:Location.
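To make the cat example concrete, here is a minimal sketch in plain Python of the two RDFS entailment rules involved (rdfs2 for rdfs:domain, rdfs3 for rdfs:range). It is only an illustration, not how any particular reasoner is implemented: triples are modelled as tuples of prefixed-name strings, and the resources ex:myCat and ex:photo123 and the helper infer_types are hypothetical names made up for this example.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def infer_types(triples):
    """Return the rdf:type triples entailed by rdfs:domain and rdfs:range.

    rdfs2: (P rdfs:domain C) and (a P x)  =>  (a rdf:type C)
    rdfs3: (P rdfs:range C)  and (a P x)  =>  (x rdf:type C)
    The sketch assumes every object is a resource, not a literal.
    """
    triples = set(triples)
    domains = [(s, o) for s, p, o in triples if p == RDFS_DOMAIN]
    ranges = [(s, o) for s, p, o in triples if p == RDFS_RANGE]

    entailed = set()
    for subj, pred, obj in triples:
        for prop, cls in domains:
            if pred == prop:
                entailed.add((subj, RDF_TYPE, cls))
        for prop, cls in ranges:
            if pred == prop:
                entailed.add((obj, RDF_TYPE, cls))
    # Report only the triples that were not already stated.
    return entailed - triples


# Declarations as described for dsw:idBasedOn in this message.
schema = {
    ("dsw:idBasedOn", RDFS_DOMAIN, "dwc:Identification"),
    ("dsw:idBasedOn", RDFS_RANGE, "dsw:Token"),
}

# A careless statement: a cat used as the subject of dsw:idBasedOn.
# ex:myCat and ex:photo123 are hypothetical resources.
data = {("ex:myCat", "dsw:idBasedOn", "ex:photo123")}

inferred = infer_types(schema | data)
for triple in sorted(inferred):
    print(triple)
```

Under these assumptions the single careless triple entails both that the cat is a dwc:Identification and that the photo is a dsw:Token. The second inference is the useful one: the photo gets typed even though whoever described it never said so.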
In contrast, property terms from the DwC standard (which we feel are in most cases appropriate for use as data rather than object properties) do not necessarily have clear domains and ranges. Some do (e.g. dwc:identifiedBy clearly has the domain dwc:Identification) but others don't. Does dwc:genus have the domain dwc:Taxon or tn:TaxonName? Can it be used as a property of a dwc:Identification? I don't know. Is there a single class that is the domain of dwc:catalogNumber? Probably not. Is there ANY existing class that is the domain of dwc:establishmentMeans? Based on the many posts about that term, I doubt it. The point is that I don't think that there is (yet at least) a consensus on what the domains and ranges are for many DwC property terms. So Cam and I did not feel comfortable assigning domains and ranges to them, nor dooming people to the sin of naughtiness if they disagreed with us.
I hope this explains our reasoning somewhat. There is, of course, the possibility that we just made a bad call and that none of our object properties should have ranges and domains. Perhaps someone can make counterarguments to what I've just said.
Steve
P.S. BTW, it makes me feel so much better to know that I'm not the only one to slip and say "rdfs:type"! Thanks for the careful read, Bob, and we're looking forward to you discovering the misstatements and inaccuracies that are undoubtedly present in the DSW documentation.
Bob Morris wrote:
I've been in Ecuador for two weeks with only a few hours of internet contact. Flying from Quito tonight. So I've only scanned the Design Document and may have done so too hastily. I'll have more time next week to look at it and the ontology. However, meanwhile, I've posted the following unsurprising comment there, and hope that anyone who responds here will also respond there.
It's a little hard to understand why DSW object properties should have domains but data properties should not. Keep in mind that the formal semantics of rdfs:domain is that if P rdfs:domain C, then any use a P x forces a to have rdfs:type C. This constrains the extensibility of P and probably of the use of other ontologies where P might be useful. Specifying domains tends to close the world somewhat, and I see no advantage to that... Indeed, this is more naughty than anti-naughty sensu Bob Morris.
--Bob Morris
On Wed, Apr 27, 2011 at 9:01 PM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
Dear colleagues,
With the exciting development of Semantic Web technologies, many of us already need a way to consistently express DwC in RDF. In particular, we need it to meet the requirements for GUID resolution (as expressed in the TDWG applicability guide) and to be able to share and aggregate diverse kinds of biodiversity metadata in the Linked Open Data world.
After several months of development, stimulated by the tdwg-content discussions of last Fall, we would like to offer an ontology for consideration, based on Darwin Core terms:
Darwin-SW: General site: http://code.google.com/p/darwin-sw/ Using namespace dsw = "http://purl.org/dsw/"
It is a candidate for general usage, but we do not claim it is _the_ solution, and greatly respect the efforts by others to develop similar ontologies, from which we have learned much. However, for DSW, we wanted to use existing DwC terms for classes and data properties whenever possible and only create new terms when there were no existing terms that would do the job. We did feel that there was a need for clarity in how resources should be typed (i.e. rdf:type property) and for object properties that expressed the relationships among classes unambiguously. Please see the Rationale, DesignPrinciples and ClassesAndTypes wiki pages at the above address.
In the ontology, we sought to embody relationships among classes based on our perception of the community consensus of what the classes represent and how they are related to each other, as expressed in posts to the tdwg-content list. Thus each class is documented carefully on the wiki with hyperlinked references to specific tdwg-content posts. We also sought to clarify or resolve issues that were raised in the list discussion, most notably the relationship among Occurrences and the evidence that documents them (i.e. tokens), and the role of "individuals". See wiki pages for the Token, Occurrence, and IndividualOrganism classes.
The ontology is now in use at http://bioimages.vanderbilt.edu/, and we intend to use it more widely. We would value your opinions on the fitness of this ontology as a general solution for consistently expressing DwC concepts in RDF.
Sincerely,
Steve Baskauf and Cam Webb steve.baskauf@Vanderbilt.Edu cwebb@oeb.harvard.edu
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content