[tdwg-content] Heretics and illuminati, oh my! [SEC=UNCLASSIFIED]

Paul Murray pmurray at anbg.gov.au
Fri May 6 05:38:04 CEST 2011

On 05/05/2011, at 1:13 PM, Steve Baskauf wrote:

> Peter DeVries wrote:
>> I also don't seem to understand why if someone can find some missing utility in existing vocabularies, and mints one starting with txn, it is seen by some as an act of heresy, while minting a new vocabulary starting with dsw is not.

I might jump in and make a technical point about OWL/RDF here.

One of the strengths of OWL is that you can define equivalencies between your own vocabulary and someone else's. You can declare that your own "Taxon" notion is narrower or broader than some other notion of taxon. The power of this is that if someone attempts to reason over some well-known vocabulary and you define semantic relationships, their rules will also reason over your dataset - as far as is possible.

What you can also do, and what I think would be very sensible, is keep your vocabulary terms and your semantic equivalencies in separate ontology documents.

For instance:

Let's say you have a predicate "isVoucherFor", and it's quite obvious that it is pretty much the same thing as http://rs.tdwg.org/ontology/voc/Specimen#isVoucherFor, but slightly more specific. Your "isVoucherFor" might, say, apply only to specimens that are RegisteredSpecimens in your vocabulary: it has a range declaration on it. In this case, it seems obvious that your isVoucherFor is a subproperty of the tdwg isVoucherFor.

I'm suggesting that the declaration of your 'isVoucherFor' probably should include your range specifier (it's part of the way your vocabulary works), but the declaration of the subproperty relationship should be put in some other file/ontology, which would import both your vocabulary and the TDWG vocabulary (or DwC vocabulary, in this case). Any RDF files containing data should probably import your vocabulary but not the equivalency semantics.
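
To make the layout concrete, here is a rough sketch of the three documents as plain sets of triples in Python. Only the TDWG isVoucherFor URI comes from the example above; the example.org URLs, the resource names, the direction of the property, and the assumption that the TDWG document lives at the URI minus its fragment are all made up for illustration.

```python
# Standard RDFS/OWL predicate URIs.
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"

# The TDWG predicate from the text. Assumption: its ontology document
# is the same URI with the fragment removed.
TDWG_DOC = "http://rs.tdwg.org/ontology/voc/Specimen"
TDWG_IS_VOUCHER_FOR = TDWG_DOC + "#isVoucherFor"

# Hypothetical URLs for "your" three documents.
MY_VOCAB = "http://example.org/myvoc"
MY_MAPPING = "http://example.org/myvoc-to-tdwg"
MY_DATA = "http://example.org/data/specimens"

# Each document is a set of (subject, predicate, object) triples.
documents = {
    # 1. The vocabulary: the term plus its range declaration.
    MY_VOCAB: {
        (MY_VOCAB + "#isVoucherFor", RDFS_RANGE, MY_VOCAB + "#RegisteredSpecimen"),
    },
    # 2. The mapping: imports both vocabularies and states the subproperty link.
    MY_MAPPING: {
        (MY_MAPPING, OWL_IMPORTS, MY_VOCAB),
        (MY_MAPPING, OWL_IMPORTS, TDWG_DOC),
        (MY_VOCAB + "#isVoucherFor", RDFS_SUBPROPERTY_OF, TDWG_IS_VOUCHER_FOR),
    },
    # 3. The data: imports the vocabulary only, never the mapping.
    #    (Which end of the property is the specimen is an assumption here.)
    MY_DATA: {
        (MY_DATA, OWL_IMPORTS, MY_VOCAB),
        (MY_DATA + "#record1", MY_VOCAB + "#isVoucherFor", MY_DATA + "#specimen1"),
    },
}


def imports_of(doc_url):
    """Return the documents that doc_url directly imports."""
    return {o for (s, p, o) in documents[doc_url]
            if s == doc_url and p == OWL_IMPORTS}
```

The point of the split is visible in `imports_of(MY_DATA)`: reading the data pulls in your vocabulary and nothing else.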

This allows you to "plug in" sets of equivalencies. A SPARQL query can drag in some data using your vocabulary, and can separately drag in your rules. Or not. That way:

* if the other vocabulary changes in ways that break things, you can define new equivalencies without disturbing an existing ruleset that people might be using
* people who want to use your terms but have their own ideas as to what the equivalencies are can do their own reasoning
* if someone's data uses your terms but using the equivalencies results in inconsistencies, a querent can plug in a stripped-down set of semantics and reason over that
* the act of simply reading and processing your data does not cause the other vocabulary to be dragged into the engine
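
A toy illustration of the "plug in" idea, in plain Python rather than a real reasoner or SPARQL engine. It applies only the two RDFS subPropertyOf rules (transitivity and property inheritance); the example.org names and the data triple are hypothetical.

```python
SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
TDWG_IS_VOUCHER_FOR = "http://rs.tdwg.org/ontology/voc/Specimen#isVoucherFor"
MY_IS_VOUCHER_FOR = "http://example.org/myvoc#isVoucherFor"  # hypothetical


def infer_subproperties(triples):
    """Return triples plus everything entailed via rdfs:subPropertyOf."""
    inferred = set(triples)
    while True:
        sub_props = {(s, o) for (s, p, o) in inferred if p == SUBPROP}
        new = set()
        # If p is a subproperty of q, then (s p o) entails (s q o).
        for (s, p, o) in inferred:
            for (sub, sup) in sub_props:
                if p == sub:
                    new.add((s, sup, o))
        # subPropertyOf is transitive.
        for (a, b) in sub_props:
            for (c, d) in sub_props:
                if b == c:
                    new.add((a, SUBPROP, d))
        if new <= inferred:  # nothing new: fixed point reached
            return inferred
        inferred |= new


def pairs_for(triples, predicate):
    """Stand-in for a query: all (subject, object) pairs using predicate."""
    return sorted((s, o) for (s, p, o) in triples if p == predicate)


# Data uses only "your" term; the mapping lives in its own set.
data = {("ex:record1", MY_IS_VOUCHER_FOR, "ex:specimen1")}
mapping = {(MY_IS_VOUCHER_FOR, SUBPROP, TDWG_IS_VOUCHER_FOR)}

# The querent decides whether to plug the mapping in.
without_mapping = pairs_for(infer_subproperties(data), TDWG_IS_VOUCHER_FOR)
with_mapping = pairs_for(infer_subproperties(data | mapping), TDWG_IS_VOUCHER_FOR)
```

Without the mapping, a query written against the TDWG term finds nothing in the data; with it, the same query finds the record. Swapping in a different or stripped-down `mapping` set changes the result without touching `data`.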

The ongoing problem is management: what rules are applicable? What rule sets are there? The TDWG site itself (or any webserver anywhere) can serve as an aggregator for bioinformatic RDF/OWL semantics simply by hosting empty ontologies that import others. Thus "Old-tdwg-and-DwC.owl", "DwC-and-taxonconcept(lax).owl", "rulesets-that-pertain-to-occurence.owl", "absolutely-everything-ever.owl", and so on.
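
As a sketch of how such empty aggregator ontologies would behave, the following Python follows imports transitively to work out everything a given aggregator drags in. The aggregator names are the ones suggested above; what each one imports (the leaf file names) is purely hypothetical.

```python
# Map each ontology document to the documents it imports.
# Aggregators carry no terms of their own, only imports.
# All leaf names below are invented for illustration.
imports = {
    "absolutely-everything-ever.owl": [
        "Old-tdwg-and-DwC.owl",
        "DwC-and-taxonconcept(lax).owl",
    ],
    "Old-tdwg-and-DwC.owl": [
        "old-tdwg.owl",
        "dwc.owl",
        "old-tdwg-to-dwc-mapping.owl",
    ],
    "DwC-and-taxonconcept(lax).owl": [
        "dwc.owl",
        "taxonconcept.owl",
        "dwc-to-taxonconcept-lax-mapping.owl",
    ],
}


def import_closure(start, imports):
    """Return start plus every document reachable by following imports."""
    seen = set()
    stack = [start]
    while stack:
        doc = stack.pop()
        if doc in seen:  # already visited; also guards against import cycles
            continue
        seen.add(doc)
        stack.extend(imports.get(doc, []))
    return seen


everything = import_closure("absolutely-everything-ever.owl", imports)
just_old_tdwg = import_closure("Old-tdwg-and-DwC.owl", imports)
```

Picking a ruleset then amounts to picking which aggregator to load: `just_old_tdwg` never touches the taxonconcept mapping, while `everything` pulls in both.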

The work, of course, is in thrashing out the equivalencies - an ongoing job.
