Because I am an RDF/OWL novice, I will not say much about how I think Jonathan/Bob's comments apply to the discussion about the scope of Individual. But I think that the suggestion that there may be stricter and broader uses of an Individual class probably describes the root of the disagreement pretty well. I have a particular, narrow use-case in mind (facilitating resampling and inferring duplicates using some kind of taxonomically homogeneous entity). Rich wants a broader interpretation based on ideas of what an individual means in various contexts. I think this difference in outlook is reflected in this statement from Rich's last response:
'We can argue about the properties and tokens later; first we need to nail down the "essence" of an Individual.'
I actually disagree strongly with this statement. I have tried to stay out of the current thread about the future course of the TDWG ontology because it isn't something that I know much about. But I think I am leaning to the side of those who suggest that we create use cases first and then see how the ontology can be developed to facilitate those use cases. I am actively using the class Individual in RDF the way I have defined it in my proposal. I know of at least one other person who plans to do so as well. It is not clear to me what the use case is for doing what Rich wants: combining what I've called the "token" aspect of individuals with the "resampling" aspect. It may be that there is such a use, but I'd like to see how it will work - in particular how it will work in RDF without "breaking" the use that I need for the Individual class. If it is possible to combine the two aspects of individuals, perhaps that might be done in a "lax" definition (using Paul's term) that reflects the "essence" of an individual. Unfortunately, figuring out what the "essence" of an individual is will be more difficult than showing how one plans to use the term.
Steve
Bob Morris wrote:
Jonathan Rees, an employee of the Science Commons and TDWG member (and who knows way more about semantic web than I do) recently sent me this. I copy it here with his permission. Each of the paragraphs seems to me to be germane in different ways to the discussions about what should be an Individual. For those not deep into RDF, for the word "axiom", you could loosely understand "rule", although that term also has technical meaning that is sometimes a little different. Jonathan raises an important use case in the second paragraph, which is data quality control. That's a topic of interest to many, but especially those following the new Annotation Interest Group. Originally, this was part of a discussion we had about my favorite hobby horse, rdfs:domain. He is not on my side. When people who know more than I do about something are skeptical of my arguments about it, I usually suspend disbelief and temporarily adopt their position.
Jonathan's first point is pretty much what Paul Murray observed yesterday in response to a question of Kevin Richards.
"(a) subclassing is the way in RDFS or OWL you would connect the more specific to the less specific, so that you can apply general theorems to a more specific entity. That is, a well-documented data set would be rendered using classes and properties that were very specific so as to not lose information, and then could be merged with a badly-documented data set by relaxing to more general classes and properties using subclass and subproperty knowledge.
(b) axioms (i.e. specificity) are valuable not only for expressing operational and inferential semantics, but also for "sanity checking" e.g. consistency, satisfiability, Clark/Parsia integrity checks ( http://clarkparsia.com/pellet/icv/ ), and similar. Being able to detect ill-formed inputs is incredibly valuable.
People talk past one another because there are many distinct use cases for RDF and assumptions are rarely surfaced. For L(O)D, you're interested in making lots of links with little effort. Semantics is the enemy because it drives up costs. For semantic web, on the other hand, you're interested in semantics, i.e. understanding and documenting the import of what's asserted and making a best effort to only assert things that are true, even in the presence of open world assumption and data set extensibility. Semantics is expensive because it requires real thought and often a lot of reverse engineering. People coming from these two places will never be able to get along."
---Jonathan Rees in email to Bob Morris
Bob Morris
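
For readers not deep into RDF, here is a rough sketch of Jonathan's point (a): data typed with a strict class can be merged with data typed only with a broader class by relaxing along subclass links. It is plain Python rather than a real RDF toolkit, and every name in it (the ex: class names, the two data sets, the helper functions) is a made-up assumption for illustration, not part of any TDWG vocabulary or of anyone's proposal.

```python
# Hypothetical subclass links (child -> parent), standing in for
# rdfs:subClassOf statements. None of these names is a real TDWG term.
SUBCLASS_OF = {
    "ex:TaxonomicallyHomogeneousIndividual": "ex:Individual",
    "ex:Individual": "ex:Thing",
}


def superclasses(cls):
    """Return cls followed by every class reachable via the subclass links."""
    seen = []
    while cls is not None and cls not in seen:
        seen.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return seen


def relax(typed, target):
    """Return the resources whose asserted class is target or a subclass of it."""
    return sorted(r for r, cls in typed.items() if target in superclasses(cls))


# A well-documented data set uses the strict class; a badly documented
# one only uses the broad class.
well_documented = {"ex:tree1": "ex:TaxonomicallyHomogeneousIndividual"}
badly_documented = {"ex:herd7": "ex:Individual"}

merged = {**well_documented, **badly_documented}

# Querying at the broad level sees both records.
print(relax(merged, "ex:Individual"))
# Querying at the strict level keeps only the record that asserted it.
print(relax(merged, "ex:TaxonomicallyHomogeneousIndividual"))
```

The strict assertion is never lost: the narrow query still returns only the record that was typed narrowly, while the broad query picks up both.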