<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Comments inline:<br>

<br>

Hilmar Lapp wrote:

<blockquote cite="mid:3DD7CAA7-3752-4DDE-BAB9-1682E68184DC@nescent.org"

 type="cite"><br>

  <div>

  <div>On May 3, 2011, at 9:00 PM, Steve Baskauf wrote:</div>

  <br class="Apple-interchange-newline">

  <blockquote type="cite"><span class="Apple-style-span"

 style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;">But

I was under the impression that one models things by describing classes

and the properties that connect them.</span></blockquote>

  <div><br>

  </div>

  <div>In OWL, properties connect instances, not classes. RDF allows

metaclasses (things that are classes and instances), but doing this

will throw most (all?) reasoners off the track.</div>

  </div>

</blockquote>

I knew I would get in trouble talking about this among experts. :-)&nbsp;

Thanks for the correction.&nbsp; I should have said "properties that connect

instances of those classes".&nbsp; I think that is what I meant.&nbsp; My point

was that in creating a model, one doesn't have to enumerate every

particular instance, particularly if there are many of them.&nbsp; One can

describe the class in general and let the users create the instances

that are appropriate for that class.&nbsp; <br>

<blockquote cite="mid:3DD7CAA7-3752-4DDE-BAB9-1682E68184DC@nescent.org"

 type="cite">

  <div><br>

  <blockquote type="cite"><span class="Apple-style-span"

 style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;">&nbsp;

Classes are (to me) a very different thing than instances of classes.&nbsp;

A model containing more than 13.6 million classes is at least 1.9

million times as complicated as a model with 7 classes.</span></blockquote>

  <div><br>

  </div>

  <div>Yes and no. I can model a taxonomy as a subclass hierarchy of

classes, or as a property-based (memberOf or some such) hierarchy of

individuals that all instantiate a single "Taxon" class. The former

isn't 1 million times more complex than the latter. However, they are

not identical either, and which approach one chooses has significant

consequences for how easy it is to express things about those taxa, and

for inferring new things from those with a DL reasoner. <br>

  </div>

  </div>

</blockquote>
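To make the two options Hilmar describes concrete, here is a rough sketch in Python that treats RDF triples as plain tuples.&nbsp; Every identifier in it (the ex: terms, specimen42, the helper function names) is made up for illustration, and the hand-written walk up the hierarchy only stands in for what a real reasoner would do; it is not a DL reasoner.<br>
<pre>
```python
# Hypothetical identifiers throughout; a triple is a (subject, predicate, object) tuple.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"
MEMBER_OF = "ex:memberOf"  # assumed property name, standing in for "memberOf or some such"

# Option A: every taxon is a class, arranged in a subclass hierarchy.
subclass_model = {
    ("ex:Felis_catus", SUBCLASS_OF, "ex:Felis"),
    ("ex:Felis", SUBCLASS_OF, "ex:Felidae"),
    ("ex:specimen42", TYPE, "ex:Felis_catus"),
}

# Option B: every taxon is an individual of one Taxon class,
# and the hierarchy is carried by a property between individuals.
instance_model = {
    ("ex:Felis_catus", TYPE, "ex:Taxon"),
    ("ex:Felis", TYPE, "ex:Taxon"),
    ("ex:Felidae", TYPE, "ex:Taxon"),
    ("ex:Felis_catus", MEMBER_OF, "ex:Felis"),
    ("ex:Felis", MEMBER_OF, "ex:Felidae"),
    ("ex:specimen42", "ex:identifiedAs", "ex:Felis_catus"),
}


def ancestors(triples, node, predicate):
    """Follow `predicate` links upward from `node` and collect everything reached."""
    found = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == predicate and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def classes_named(triples):
    """Terms used as classes: objects of rdf:type plus both ends of rdfs:subClassOf."""
    names = set()
    for s, p, o in triples:
        if p == TYPE:
            names.add(o)
        elif p == SUBCLASS_OF:
            names.update((s, o))
    return names


print(sorted(classes_named(subclass_model)))
print(sorted(classes_named(instance_model)))
print(sorted(ancestors(subclass_model, "ex:Felis_catus", SUBCLASS_OF)))
print(sorted(ancestors(instance_model, "ex:Felis_catus", MEMBER_OF)))
```
</pre>
In both versions the walk up from ex:Felis_catus reaches ex:Felis and ex:Felidae.&nbsp; The difference is that the first version names three classes and the second names only ex:Taxon, and in the second the hierarchy is ordinary data that a reasoner would follow only if the property were declared transitive.<br>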

Well, I guess I was influenced in my thinking about this by

<a class="moz-txt-link-freetext" href="http://wiki.tdwg.org/twiki/bin/view/TAG/SubclassOrNot">http://wiki.tdwg.org/twiki/bin/view/TAG/SubclassOrNot</a> , particularly

the part about the cats, which I can actually understand pretty well.&nbsp;

To me, the part of that wiki page that is most relevant to this

discussion is: <br>

<p>"Whether the subclassing option is preferable to the tagging

approach depends on the use of the ontology. The TDWG ontology's

principal role is not modeling the entire domain to permit inference

but allowing the mark up of data so that it will flow between

applications as freely as possible. It has to be something that is easy

to map into multiple technologies and something that people can agree

on rapidly.

</p>

<p>This strongly suggests that the tagging approach should be taken

wherever possible. First agree on the basic semantic units and model

the rest of the semantics with tagging. Only subclass when absolutely

necessary."

</p>

The really great thing about this is that I can dodge further

responsibility by just blaming my way of thinking on the people who

posted that page (Roger Hyam modified by Bob Morris, I think). :-)&nbsp; But

seriously, I think that the statement above pretty well summarizes what

may be the difference between what Pete and I are saying.&nbsp; My primary

concern is to allow "the mark up of data so that it will flow between

applications as freely as possible".&nbsp; Pete's point may be to permit

inferencing.<br>

<br>

OK, so let's imagine that we mark up several million records of

specimens, tissue samples, and images as RDF.&nbsp; (We don't have to

imagine very hard; I think the BiSciCol group is planning to actually

do this within the next several months.)&nbsp; I would really like to hear

from some of the people who actually use "DL reasoners" (a group which

certainly does not include me) to know what it is that we could

actually find out that would be useful about that big data blob using

reasoners.&nbsp; I have already confessed that my primary concern is

enabling data discovery, transfer, and aggregation using GUIDs and

RDF.&nbsp; I'm still somewhat of a "semantic web" skeptic as far as the

whole inferencing thing is concerned.&nbsp; Aside from inferring

"duplicates", I would really like to know what else useful

could be inferred outside of the Taxon/TaxonConcept class.&nbsp; (I can

imagine useful reasoning being done about things in that class, like the

relationships among names, concepts, parent taxa, etc.; see, e.g., Rod Page's

Biodiversity Informatics 3:1-15 article

<a class="moz-txt-link-freetext" href="https://journals.ku.edu/index.php/jbi/article/view/25">https://journals.ku.edu/index.php/jbi/article/view/25</a>)&nbsp; I think this

(data markup priority vs. inferencing priority) is an important

discussion to have before the TDWG community can settle on some kind of

consensus way of turning database records into RDF, particularly if it

is going to have a big influence on the way the RDF model is set up.&nbsp;

To me, there is a clear and immediate need to be able to mark data up

in a straightforward way.&nbsp; If we can get the semantic part, too, that

would be great but not at the expense of data markup.&nbsp; I just was at a

meeting of a bunch of herbarium curators.&nbsp; They desperately need a way

to implement GUIDs and aggregate data and they need it now.&nbsp; I really

don't think they care one whit about inferencing.&nbsp; If we converge on a

model that is great for doing cool things with 10 records but which

can't handle hundreds of thousands of records easily and simply, then

we are wasting our time.&nbsp; I don't think we need to dither about this

for another five years.<br>
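As a rough illustration of what I mean by aggregation using GUIDs, here is a minimal Python sketch.&nbsp; The two providers, the urn:example identifiers, the data values, and the ex:depicts property are all hypothetical, and a real implementation would use an RDF library and a triple store; the only point is that records sharing a GUID merge with no inferencing at all.<br>
<pre>
```python
from collections import defaultdict

# Hypothetical records from two providers, already expressed as
# (subject, predicate, object) triples that reuse the same specimen GUID.
herbarium_a = [
    ("urn:example:specimen:1", "dwc:catalogNumber", "A-0001"),
    ("urn:example:specimen:1", "dwc:scientificName", "Quercus alba"),
]
image_bank_b = [
    ("urn:example:image:9", "ex:depicts", "urn:example:specimen:1"),
    ("urn:example:specimen:1", "dwc:country", "United States"),
]


def aggregate(*sources):
    """Group every (predicate, object) pair under the GUID it describes."""
    merged = defaultdict(set)
    for triples in sources:
        for s, p, o in triples:
            merged[s].add((p, o))
    return merged


merged = aggregate(herbarium_a, image_bank_b)
for guid in sorted(merged):
    print(guid, sorted(merged[guid]))
```
</pre>
Everything either provider said about urn:example:specimen:1 ends up under that one key, simply because both used the same identifier.<br>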

<blockquote cite="mid:3DD7CAA7-3752-4DDE-BAB9-1682E68184DC@nescent.org"

 type="cite">

  <div><br>

  <blockquote type="cite"><span class="Apple-style-span"

 style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;">&nbsp;

I would hate to have to draw an RDF graph of that model</span></blockquote>

  </div>

  <div><br>

  </div>

  <div>I would as much hate to have to draw an RDF graph of 1.7 million

instances. The point being, in order to draw a graph of how someone

models a domain you don't draw a graph of the entire RDF triple store.</div>

</blockquote>

That was the point I was trying to make (I think).<br>

<br>

Thanks for the clarification, Hilmar.<br>

Steve<br>

<blockquote cite="mid:3DD7CAA7-3752-4DDE-BAB9-1682E68184DC@nescent.org"

 type="cite">

  <div><br>

  </div>

  <div><span class="Apple-tab-span" style="white-space: pre;"> </span>-hilmar</div>

  <br>

  <div> <span class="Apple-style-span"

 style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">

  <div style="">

  <div><font class="Apple-style-span" face="Monaco" size="3"><span

 class="Apple-style-span" style="font-size: 11px;">--&nbsp;</span></font></div>

  <div><font class="Apple-style-span" face="Monaco" size="3"><span

 class="Apple-style-span" style="font-size: 11px;">===========================================================</span></font></div>

  <div><font class="Apple-style-span" face="Monaco" size="3"><span

 class="Apple-style-span" style="font-size: 11px;">: Hilmar Lapp &nbsp;-:-

Durham, NC -:- informatics.nescent.org :</span></font></div>

  <div><font class="Apple-style-span" face="Monaco" size="3"><span

 class="Apple-style-span" style="font-size: 11px;">===========================================================</span></font></div>

  <div><br class="webkit-block-placeholder">

  </div>

  </div>

  </span><br class="Apple-interchange-newline">

  </div>

  <br>

</blockquote>

<br>

<pre class="moz-signature" cols="72">-- 

Steven J. Baskauf, Ph.D., Senior Lecturer

Vanderbilt University Dept. of Biological Sciences

postal mail address:

VU Station B 351634

Nashville, TN  37235-1634,  U.S.A.

delivery address:

2125 Stevenson Center

1161 21st Ave., S.

Nashville, TN 37235

office: 2128 Stevenson Center

phone: (615) 343-4582,  fax: (615) 343-6707

<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a>

</pre>

</body>

</html>