[tdwg-tag] class design, generalization, L(O)D

Steve Baskauf steve.baskauf at vanderbilt.edu
Mon Nov 15 21:00:55 CET 2010


Because I am an RDF/OWL novice, I will not say much about how I think 
Jonathan/Bob's comments apply to the discussion about the scope of 
Individual.  But I think that the suggestion that there may be stricter 
and broader uses of an Individual class probably describes the root of 
the disagreement pretty well.  I have a particular, narrow use-case in 
mind (facilitating resampling and inferring duplicates using some kind 
of taxonomically homogeneous entity).  Rich wants a broader 
interpretation based on ideas of what an individual means in various 
contexts.  I think this difference in outlook is reflected in this 
statement from Rich's last response:

'We can argue about the properties and tokens later; first we need to nail
down the "essence" of an Individual.'

I actually disagree strongly with this statement.  I have tried to stay 
out of the current thread about the future course of the TDWG ontology 
because it isn't something that I know much about.  But I think I am 
leaning to the side of those who suggest that we create use cases first 
and then see how the ontology can be developed to facilitate those use 
cases.  I am actively using the class Individual in RDF the way I have 
defined it in my proposal.  I know of at least one other person who 
plans to do so as well.  It is not clear to me what the use case is for 
doing what Rich wants: combining what I've called the "token" aspect of 
individuals with the "resampling" aspect.  It may be that there is such 
a use, but I'd like to see how it will work - in particular how it will 
work in RDF without "breaking" the use that I need for the Individual 
class.  If it is possible to combine the two aspects of individuals, 
perhaps that might be done in a "lax" definition (using Paul's term) 
that reflects the "essence" of an individual.  Unfortunately, figuring 
out what the "essence" is of an individual is more difficult than 
showing how one plans to use the term. 

Steve

Bob Morris wrote:
> Jonathan Rees, an employee of the Science Commons and TDWG member
> (and who knows way more about semantic web than I do) recently sent me
> this. I copy it here with his permission. Each of the paragraphs seems
> to me to be germane in different ways to the discussions about what
> should be an Individual. For those not deep into RDF, for the word
> "axiom", you could loosely understand "rule", although that term also
> has technical meaning that is sometimes a little different. Jonathan
> raises an important use case in the second paragraph, which is data
> quality control.  That's a topic of interest to many, but especially
> those following the new Annotation Interest Group. Originally, this
> was part of a discussion we had about my favorite hobby horse,
> rdfs:domain.  He is not on my side.  When people who know more than I
> do about something are skeptical of my arguments about it, I usually
> suspend disbelief and temporarily adopt their position.
>
> Jonathan's first point is pretty much what Paul Murray observed
> yesterday in response to a question of Kevin Richards.
>
>
> "(a) subclassing is the way in RDFS or OWL you would connect the more
> specific to the less specific, so that you can apply general theorems
> to a more specific entity.  That is, a well-documented data set would
> be rendered using classes and properties that were very specific so as
> to not lose information, and then could be merged with a
> badly-documented data set by relaxing to more general classes and
> properties using subclass and subproperty knowledge.
>
> (b) axioms (i.e. specificity) are valuable not only for expressing
> operational and inferential semantics, but also for "sanity checking"
> e.g. consistency, satisfiability, Clark/Parsia integrity checks (
> http://clarkparsia.com/pellet/icv/ ), and similar. Being able to
> detect ill-formed inputs is incredibly valuable.
>
> People talk past one another because there are many distinct use cases
> for RDF and assumptions are rarely surfaced. For L(O)D, you're
> interested in making lots of links with little effort. Semantics is
> the enemy because it drives up costs. For semantic web, on the other
> hand, you're interested in semantics, i.e. understanding and
> documenting the import of what's asserted and making a best effort to
> only assert things that are true, even in the presence of open world
> assumption and data set extensibility. Semantics is expensive because
> it requires real thought and often a lot of reverse engineering.
> People coming from these two places will never be able to get along."
> ---Jonathan Rees in email to Bob Morris
> ================
>
>
> Bob Morris
>
>   
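
As a rough illustration of point (a) in the quoted text, here is a minimal
sketch of "relaxing" specific classes to a more general one so that two data
sets can be merged.  Everything in it is an assumption made for illustration:
the ex: class names, the SUBCLASS_OF table, and the relax() helper are
hypothetical and are not taken from Darwin Core, the TDWG ontology, or any
RDF library.  It only mimics the effect of rdfs:subClassOf with plain Python
data.

```python
# Hypothetical subclass table standing in for rdfs:subClassOf statements.
# None of these class names come from Darwin Core or the TDWG ontology.
SUBCLASS_OF = {
    "ex:LivingSpecimenIndividual": "ex:Individual",
    "ex:Individual": "ex:Thing",
}


def superclasses(cls):
    """Return cls followed by every ancestor reachable through SUBCLASS_OF."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def relax(records, target):
    """Retype each (id, class) record as target when target is that class
    or one of its ancestors; leave unrelated records unchanged."""
    return [
        (rid, target if target in superclasses(cls) else cls)
        for rid, cls in records
    ]


# A well-documented data set uses the specific class; a badly-documented
# one only knows the general class.
well_documented = [("ex:tree1", "ex:LivingSpecimenIndividual")]
badly_documented = [("ex:tree2", "ex:Individual")]

# Relax the specific records to the general class, then merge.
merged = relax(well_documented, "ex:Individual") + badly_documented
print(merged)
```

The specific typing is still available in the original data set; only the
merged view is expressed at the more general level.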

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu


