I knew I would get in trouble talking about this among experts. :-) Thanks for the correction. I should have said "properties that connect instances of those classes". I think that is what I meant. My point was that in creating a model, one doesn't have to enumerate every particular instance, particularly if there are many of them. One can describe the class in general and let the users create the instances that are appropriate for that class.
On May 3, 2011, at 9:00 PM, Steve Baskauf wrote:
But I was under the impression that one models things by describing classes and the properties that connect them.
In OWL, properties connect instances, not classes. RDF allows metaclasses (things that are classes and instances), but doing this will throw most (all?) reasoners off the track.
Well, I guess I was influenced in my thinking about this by http://wiki.tdwg.org/twiki/bin/view/TAG/SubclassOrNot , particularly the part about the cats, which I can actually understand pretty well. To me, the part of that wiki page that is most relevant to this discussion is:
Classes are (to me) a very different thing than instances of classes. A model containing more than 13.6 million classes is at least 1.9 million times as complicated as a model with 7 classes.
Yes and no. I can model a taxonomy as a subclass hierarchy of classes, or as a property-based (memberOf or some such) hierarchy of individuals that all instantiate a single "Taxon" class. The former isn't 1 million times more complex than the latter. However, they are not identical either, and which approach one chooses has significant consequences for how easy it is to express things about those taxa, and for inferring new things from those with a DL reasoner.
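To make the two options concrete, here is a minimal sketch in plain Python, with triples written as (subject, predicate, object) tuples. Everything in the ex: namespace, including ex:memberOf, ex:identifiedAs and the individual ex:tom, is made up for illustration and is not taken from the TDWG ontology. The inferred_types helper applies only the RDFS subclass rule; it is a stand-in for what a reasoner would do, not a DL reasoner.

```python
# Triples are plain (subject, predicate, object) tuples.
# All ex: names below are hypothetical and used only for illustration.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
MEMBER_OF = "ex:memberOf"  # assumed property name for the second approach

# Approach A: every taxon is a class; the hierarchy is rdfs:subClassOf.
subclass_model = {
    ("ex:Felis_catus", SUBCLASS_OF, "ex:Felis"),
    ("ex:Felis", SUBCLASS_OF, "ex:Felidae"),
    ("ex:tom", RDF_TYPE, "ex:Felis_catus"),
}

# Approach B: a single class ex:Taxon; every taxon is an individual,
# and the hierarchy is carried by the ex:memberOf property.
tagging_model = {
    ("ex:Felis_catus", RDF_TYPE, "ex:Taxon"),
    ("ex:Felis", RDF_TYPE, "ex:Taxon"),
    ("ex:Felidae", RDF_TYPE, "ex:Taxon"),
    ("ex:Felis_catus", MEMBER_OF, "ex:Felis"),
    ("ex:Felis", MEMBER_OF, "ex:Felidae"),
    ("ex:tom", "ex:identifiedAs", "ex:Felis_catus"),
}


def classes_used(triples):
    """Return every resource that appears in a class position."""
    found = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            found.add(o)
        elif p == SUBCLASS_OF:
            found.update((s, o))
    return found


def inferred_types(triples, individual):
    """Apply the RDFS rule: x type C, C subClassOf D  =>  x type D."""
    types = {o for s, p, o in triples if s == individual and p == RDF_TYPE}
    frontier = list(types)
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in types:
                types.add(o)
                frontier.append(o)
    return types
```

In approach A the number of classes grows with the number of taxa, but the subclass rule alone places ex:tom in ex:Felidae. In approach B there is one class however many taxa exist, and the same conclusion would need an extra rule (for example, treating ex:memberOf as transitive), which is the trade-off described above.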
"Whether the subclassing option is preferable to the tagging approach depends on the use of the ontology. The TDWG ontology's principal role is not modeling the entire domain to permit inference but allowing the mark up of data so that it will flow between applications as freely as possible. It has to be something that is easy to map into multiple technologies and something that people can agree on rapidly.
This strongly suggests that the tagging approach should be taken wherever possible. First agree on the basic semantic units and model the rest of the semantics with tagging. Only subclass when absolutely necessary."
The really great thing about this is that I can dodge further responsibility by just blaming my way of thinking on the people who posted that page (Roger Hyam modified by Bob Morris, I think). :-) But seriously, I think that the statement above pretty well summarizes what may be the difference between what Pete and I are saying. My primary concern is to allow "the mark up of data so that it will flow between applications as freely as possible". Pete's point may be to permit inferencing. That was the point I was trying to make (I think).
I would hate to have to draw an RDF graph of that model.
I would as much hate to have to draw an RDF graph of 1.7 million instances. The point being, in order to draw a graph of how someone models a domain you don't draw a graph of the entire RDF triple store.
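As a rough sketch of that point: one can collapse instance-level triples into a class-level picture by replacing each subject and object with its rdf:type. The function name schema_summary and all ex: names are assumptions made for this example; the code is plain Python over (subject, predicate, object) tuples, not any particular triple store's API.

```python
from collections import Counter


def schema_summary(triples):
    """Collapse instance-level triples into class-level edges.

    Returns a dict mapping (subject class, property, object class) to the
    number of instance-level triples that the edge stands for.
    """
    # First pass: record the declared type(s) of every typed resource.
    type_of = {}
    for s, p, o in triples:
        if p == "rdf:type":
            type_of.setdefault(s, set()).add(o)

    # Second pass: lift each non-type triple to the class level.
    edges = Counter()
    for s, p, o in triples:
        if p == "rdf:type":
            continue
        for s_class in type_of.get(s, ()):
            for o_class in type_of.get(o, ()):
                edges[(s_class, p, o_class)] += 1
    return dict(edges)


# Hypothetical data: three taxa linked by an assumed ex:memberOf property.
store = [
    ("ex:Felis_catus", "rdf:type", "ex:Taxon"),
    ("ex:Felis", "rdf:type", "ex:Taxon"),
    ("ex:Felidae", "rdf:type", "ex:Taxon"),
    ("ex:Felis_catus", "ex:memberOf", "ex:Felis"),
    ("ex:Felis", "ex:memberOf", "ex:Felidae"),
]

summary = schema_summary(store)
```

However many taxa the store holds, the summary here has a single edge (ex:Taxon, ex:memberOf, ex:Taxon), and that edge is what one would draw.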
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
===========================================================
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu