Thanks for this helpful suggestion, Paul.  I have a question related to what you said.  As a general practice, DSW "imported" class terms from DwC to use as the basic classes in the ontology.  This seemed like a relatively "safe" thing to do, since the RDF definitions of the class terms in DwC don't really say anything restrictive about how they are used.  We then basically tried to describe (using properties) how those classes were connected and put in some restrictions intended to prevent people from calling apples oranges (range/domain restrictions coupled with disjoint declarations).  However, we went beyond that in the Taxon class.  We declared that dwc:Taxon was equivalent to tc:Taxon (Taxon in the TDWG ontology, which is declared within the TDWG ontology to be the same thing as tc:TaxonConcept).  Our motivation was to try to add some clarity to what exactly the dwc:Taxon class was (that wasn't exactly clear to us) and also to avoid having to try to describe the properties of the Taxon class since they were already described (to some extent) within the TDWG ontology.  However, from what you have said below, it sounds like it might have been a better idea to have declared the equivalency of dwc:Taxon and tc:Taxon outside of the main ontology document to allow further development of the Taxon class within DSW without causing a conflict with the description of Taxon within the TDWG ontology.

I guess the wisdom of doing this partially depends on the future of the TDWG ontology, which is not in our (Cam's and my) hands.  We didn't really want to get into describing Taxon (not our area of specialty) and the tc:Taxon class and terms were in use by some people (you, for example).  But if the TDWG ontology is permanently "frozen" (a.k.a. abandoned), then maybe tying Taxon to it was not wise.  I would be interested in your opinion on this.

Steve

Paul Murray wrote:

On 05/05/2011, at 1:13 PM, Steve Baskauf wrote:

Peter DeVries wrote:
I also don't seem to understand why, if someone can find some missing utility in existing vocabularies and mints one starting with txn, it is seen by some as an act of heresy, while the minting of a new vocabulary starting with dsw is not.


I might jump in and make a technical point about OWL/RDF here.

One of the strengths of OWL is that you can define equivalencies between your own vocabulary and someone else's. You can declare that your own "Taxon" notion is narrower or broader than some other notion of taxon. The power of this is that if someone attempts to reason over some well-known vocabulary and you define semantic relationships, their rules will also reason over your dataset - as far as is possible.

However, what you can do, and what I think would be a very sensible thing to do, is to keep your vocabulary terms and your semantic equivalencies in separate ontology documents. 

For instance:

Let's say you have a predicate "isVoucherFor", and it's quite obvious that it is pretty much the same thing as http://rs.tdwg.org/ontology/voc/Specimen#isVoucherFor, but slightly more specific. For instance, your "isVoucherFor" might apply only to specimens that are RegisteredSpecimens in your vocabulary. It has a range declaration on it. In this case, it seems obvious that your isVoucherFor is a subproperty of the tdwg isVoucherFor.

I'm suggesting that the declaration of your 'isVoucherFor' probably should include your range specifier (it's part of the way your vocabulary works), but the declaration of the subproperty relationship should be put in some other file/ontology, which would import both your vocabulary and the TDWG vocabulary (or DwC vocabulary, in this case). Any RDF files containing data should probably import your vocabulary but not the equivalency semantics.

This allows you to "plug in" sets of equivalencies. A sparql query can drag in some data using your vocabulary, and can separately drag in your rules. Or not. That way:

* if the other vocabulary changes in ways that break things, you can define new equivalencies without disturbing an existing ruleset that people might be using
* people who want to use your terms but have their own ideas as to what the equivalencies are can do their own reasoning
* if someone's data uses your terms but using the equivalencies results in inconsistencies, a querent can plug in a stripped-down set of semantics and reason over that
* the act of simply reading and processing your data does not cause the other vocabulary to be dragged into the engine
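
As a rough illustration of the "plug in" idea, here is a minimal sketch in plain Python that treats each ontology document as a set of triples. Only the TDWG isVoucherFor URI comes from the message above; the "my:" and "ex:" names, the three variable names and the helper function are hypothetical, and the function applies just the single subproperty rule rather than anything like full OWL reasoning.

```python
# Hypothetical names throughout; only TDWG_VOUCHER is taken from the message.
RDFS_SUBPROPERTY = "rdfs:subPropertyOf"
RDFS_RANGE = "rdfs:range"
TDWG_VOUCHER = "http://rs.tdwg.org/ontology/voc/Specimen#isVoucherFor"

# Document 1: the vocabulary itself, including its own range declaration.
vocabulary = {
    ("my:isVoucherFor", RDFS_RANGE, "my:RegisteredSpecimen"),
}

# Document 2: the equivalency semantics, kept separate from the vocabulary.
mapping = {
    ("my:isVoucherFor", RDFS_SUBPROPERTY, TDWG_VOUCHER),
}

# Document 3: data that uses only the local vocabulary.
data = {
    ("ex:a", "my:isVoucherFor", "ex:b"),
}


def apply_subproperties(triples):
    """Return the triples plus everything entailed by rdfs:subPropertyOf."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple appears (handles chains)
        changed = False
        sub_of = {(s, o) for s, p, o in inferred if p == RDFS_SUBPROPERTY}
        for s, p, o in list(inferred):
            for sub, sup in sub_of:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


# Reading the data with the vocabulary alone drags in nothing from TDWG.
plain = apply_subproperties(data | vocabulary)

# Plugging in the mapping makes the TDWG-level statement available as well.
mapped = apply_subproperties(data | vocabulary | mapping)
```

With these sets, the triple ("ex:a", TDWG_VOUCHER, "ex:b") is absent from "plain" and present in "mapped", so a querent decides whether the TDWG semantics apply simply by choosing which documents to load.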

The ongoing problem is management: what rules are applicable? What rule sets are there? The TDWG site itself (or any webserver anywhere) can serve as an aggregator for bioinformatic RDF/OWL semantics simply by hosting empty ontologies that import others. Thus "Old-tdwg-and-DwC.owl", "DwC-and-taxonconcept(lax).owl", "rulesets-that-pertain-to-occurence.owl", "absolutely-everything-ever.owl", and so on.

The work, of course, is in thrashing out the equivalencies - an ongoing job.
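
To make the aggregator idea concrete, here is a small sketch, again in plain Python, of how a client might work out which documents an "empty" ontology pulls in by following its imports. The three aggregator file names are the ones mentioned above; every other file name and the whole import table are invented for illustration and do not describe anything actually hosted by TDWG.

```python
# Hypothetical import table: each key is an ontology document, each value
# the documents it imports. Aggregators carry no terms of their own.
IMPORTS = {
    "Old-tdwg-and-DwC.owl": [
        "tdwg-ontology.owl", "dwc.owl", "tdwg-dwc-mapping.owl",
    ],
    "tdwg-dwc-mapping.owl": ["tdwg-ontology.owl", "dwc.owl"],
    "DwC-and-taxonconcept(lax).owl": ["dwc.owl", "taxonconcept.owl"],
    "absolutely-everything-ever.owl": [
        "Old-tdwg-and-DwC.owl", "DwC-and-taxonconcept(lax).owl",
    ],
}


def import_closure(document, imports):
    """Return the document plus everything it imports, directly or not."""
    seen = set()
    stack = [document]
    while stack:
        current = stack.pop()
        if current in seen:  # already visited; also guards against cycles
            continue
        seen.add(current)
        stack.extend(imports.get(current, []))
    return seen


everything = import_closure("absolutely-everything-ever.owl", IMPORTS)
just_dwc = import_closure("dwc.owl", IMPORTS)
```

Here "everything" contains all seven documents, while "just_dwc" contains only "dwc.owl", which is the point of keeping the vocabulary documents free of imports: loading a vocabulary on its own drags in nothing else.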


-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu