<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
John reminded me that I hadn't responded to his last post in the thread
about my call for action on my proposal
(<a class="moz-txt-link-freetext" href="http://code.google.com/p/darwincore/issues/detail?id=68">http://code.google.com/p/darwincore/issues/detail?id=68</a>) for adding
DigitalStillImage to the DwC type vocabulary
(<a class="moz-txt-link-freetext" href="http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm">http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm</a>) to serve as a
controlled value for basisOfRecord
(<a class="moz-txt-link-freetext" href="http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord">http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord</a>). After he sent
the reply I actually lay awake in bed pondering what it was that
disturbed me so much about basisOfRecord. After mulling it over for a
few days I have concluded that the problem is that basisOfRecord and
the DwC type vocabulary are too heavily overloaded. I will try to briefly
state why I think this and then say what I think the implications are
for my DigitalStillImage proposal. Apologies in advance to Bob for
sloppy use of technical terms.<br>
<br>
THE PURPOSES<br>
It seems like there are basically three separate purposes that
basisOfRecord and the dwctype vocabulary can serve/do serve/are
intended to serve:<br>
1. To define the types of resources, e.g. to serve as an object for the
predicate rdf:type. Semantic reasoning based on assigning an rdf:type
property to a resource would indicate that the resource was an instance
of an rdfs:Class (see <a class="moz-txt-link-freetext" href="http://www.w3.org/TR/rdf-schema/#ch_type">http://www.w3.org/TR/rdf-schema/#ch_type</a>).<br>
2. To serve as a property of an instance of dwc:Occurrence to allow a
user to evaluate its fitness for use, e.g. an Occurrence having
basisOfRecord=PreservedSpecimen is an Occurrence having supporting
evidence (or a "token") that is a preserved specimen. <br>
3. To formally define the ontological relationship among DwC classes
through the assignment of an rdfs:subClassOf property to a member of
the DwC type vocabulary (see
<a class="moz-txt-link-freetext" href="http://www.w3.org/TR/rdf-schema/#ch_subclassof">http://www.w3.org/TR/rdf-schema/#ch_subclassof</a>). As noted below, the
current term definitions state that dwctype:PreservedSpecimen is a
subclass of dwctype:Occurrence, which is a subclass of dcmitype:Event.<br>
<br>
I think that at the time of the adoption of Darwin Core as a standard,
this overloading may have made sense, but now that people are
considering using Darwin Core terms in RDF, this overloading is a
hindrance, not a help. If I state that a resource has
rdf:type=dwctype:PreservedSpecimen in an attempt to describe what the
resource is, I end up causing unintended inferences to be drawn by
semantic reasoners. If I state that a resource has
dwc:basisOfRecord=dwctype:PreservedSpecimen, it is not clear whether
I'm talking about the Occurrence or the physical specimen itself. <br>
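<br>
A minimal sketch of the unintended inference, in plain Python with no RDF library. It assumes only the two subclass assertions described above and RDFS subclass entailment; the example.org URI and the function name are made up for illustration, and this is not the actual DwC RDF.<br>
<pre>
```python
# Hypothetical, hand-rolled illustration of RDFS subclass entailment.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASSOF = "rdfs:subClassOf"

triples = {
    # Subclass assertions as the current term definitions are described above.
    ("dwctype:PreservedSpecimen", RDFS_SUBCLASSOF, "dwctype:Occurrence"),
    ("dwctype:Occurrence", RDFS_SUBCLASSOF, "dcmitype:Event"),
    # The only statement the data provider meant to make (made-up URI).
    ("http://example.org/specimen/1", RDF_TYPE, "dwctype:PreservedSpecimen"),
}

def inferred_types(resource, graph):
    """Return every class the resource is an instance of once subclass links are followed."""
    found = {o for (s, p, o) in graph if s == resource and p == RDF_TYPE}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for (s, p, o) in graph:
            if s == cls and p == RDFS_SUBCLASSOF and o not in found:
                found.add(o)
                frontier.append(o)
    return found

types = inferred_types("http://example.org/specimen/1", triples)
print(sorted(types))
```
</pre>
Under these assumptions the specimen comes out typed not only as a PreservedSpecimen but also as an Occurrence and as a dcmitype:Event, which is the kind of conclusion the provider probably did not intend.<br>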
<br>
SUGGESTIONS<br>
I would like to suggest that the way forward here is to separate these
three functions, at least for the present. It seems clear from the
recent discussion on the email list that the kind of subclassing that
currently exists (#3 above) does not represent the way that many people
in the Darwin Core constituency are looking at PreservedSpecimens,
Occurrences, and Events. I would recommend removing the
rdfs:subClassOf properties from all of the terms in the Darwin Core
vocabulary until more substantial work is completed (with community
consensus) on the TDWG ontology. To facilitate purpose #1, I would
also recommend that all current Darwin Core classes be represented in
the dwctype vocabulary. As I noted, the dwc:Identification class and
some others are missing. Based on what John said below, I guess this
was intentional, but if dwctype was intended for use #1, then all of
the classes should be there. I actually believe that it would be
beneficial to have a PhysicalSpecimen class and to move some Occurrence
terms to it (similar issues with Taxon class), but that is a hornet's
nest that I don't want to kick at the moment. But anyway it would be
relatively straightforward to just include all of the existing classes
as dwctypes. <br>
<br>
Purpose #2 is the one that seems the most problematic to me. It seems
to me that if we want to refine Occurrence (as seems to be the purpose
of making PreservedSpecimen, HumanObservation, etc. subclasses of
Occurrence), then it would be better to create separate terms for those
refined Occurrences to be used as the object of basisOfRecord rather
than using the dwctypes. Thus we would have something like
basisOfRecord="OccurrenceWithEvidencePreservedSpecimen", etc. rather
than basisOfRecord="PreservedSpecimen". It would then be clear that
the subject is the Occurrence, not the evidence. With this approach,
there would be no problem with making the statement
[OccurrenceWithEvidencePreservedSpecimen] rdfs:subClassOf
[dwctype:Occurrence] in contrast with saying [PreservedSpecimen]
rdfs:subClassOf [dwctype:Occurrence]. I'm not saying that this is a
good thing to do - in fact I don't like it at all based on the
philosophy that Roger laid out in
<a class="moz-txt-link-freetext" href="http://wiki.tdwg.org/twiki/bin/view/TAG/SubclassOrNot">http://wiki.tdwg.org/twiki/bin/view/TAG/SubclassOrNot</a>. But it is the
approach that would get us out of making statements that don't make
sense due to our lack of separation between the desire to refine
Occurrence and the desire to define rdf:types. <br>
<br>
ABOUT THE DigitalStillImage PROPOSAL<br>
Having said that, it seems like DigitalStillImage is not needed to
serve purpose #1. For that we already have
<a class="moz-txt-link-freetext" href="http://purl.org/dc/dcmitype/StillImage">http://purl.org/dc/dcmitype/StillImage</a>. However, if people want to do
#2, then DigitalStillImage should be there as a value for
basisOfRecord. As I said in the paragraph above,
"OccurrenceWithEvidenceDigitalStillImage" would probably be better than
"DigitalStillImage" for this purpose. But I don't really care anymore
because I have no intention of doing #2 since I would prefer not to
subclass Occurrence. Instead I would create a new term (desperately
needed, I think) that is a property of Occurrence and call it
"hasEvidence", "hasToken", "evidenceID", or "tokenID", and that
connects an Occurrence to an evidence URI. I would then type the
evidence using rdf:type. But from what John says, it seems that there
may be a lot of people who want to subclass Occurrence and for their
benefit perhaps DigitalStillImage should be there. <br>
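<br>
A rough sketch of the alternative, again in plain Python. The property name "dwc:hasEvidence" is only one of the candidate names listed above and does not exist in Darwin Core; the example.org URIs and the helper function are likewise hypothetical.<br>
<pre>
```python
# Hypothetical sketch: link an Occurrence to its evidence, then type the evidence.
RDF_TYPE = "rdf:type"
HAS_EVIDENCE = "dwc:hasEvidence"  # proposed name only, not an existing DwC term

occurrence = "http://example.org/occurrence/42"  # made-up URI
image = "http://example.org/image/42.jpg"        # made-up URI

triples = [
    (occurrence, RDF_TYPE, "http://rs.tdwg.org/dwc/terms/Occurrence"),
    (occurrence, HAS_EVIDENCE, image),
    # The evidence is typed with an existing well-known vocabulary.
    (image, RDF_TYPE, "http://purl.org/dc/dcmitype/StillImage"),
]

def evidence_types(occ, graph):
    """Map each piece of evidence for an Occurrence to its rdf:type values."""
    evidence = [o for (s, p, o) in graph if s == occ and p == HAS_EVIDENCE]
    return {
        ev: sorted(o for (s, p, o) in graph if s == ev and p == RDF_TYPE)
        for ev in evidence
    }

result = evidence_types(occurrence, triples)
print(result)
```
</pre>
In this arrangement the Occurrence itself is never subclassed; what kind of evidence backs it is found by following the link and reading the type of the evidence.<br>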
<br>
From my point of view, I would be happy either to let the
proposal go up for a vote as is and let the TAG accept it or shoot it
down, or to "freeze" the proposal (something John at some point hinted
might be possible) until some future time when the ontology and RDF
development have reached a point where there is a consensus view
(expressed in the DwC standard itself, not just in the email list)
about what an Occurrence is and what its properties (including
basisOfRecord) should be and mean. I just think that I'm ready to be
done with worrying about it.<br>
<br>
Steve<br>
<br>
P.S. I'm going to stick in a few inline comments below as appropriate.
Especially take note of the comment regarding the creation of new
classes and typing.<br>
<br>
John Wieczorek wrote:
<blockquote
cite="mid:AANLkTi=jwq2K0sR5yKR5xvPsoJGu3cdWA4fwWLz9va8A@mail.gmail.com"
type="cite">...<br>
<div>
<div><span class="Apple-style-span"
style="font-family: Verdana,Arial,'Arial Unicode MS',Helvetica,sans-serif;"></span><br>
<div><br>
</div>
<div>I think we have a lot of work to go the next step, to get the
semantics right, for which choices have to be considered more carefully
than has been done in the past. Thankfully, it seems we have reached a
critical mass to try to meet the challenge. The effort is much
appreciated.</div>
</div>
</div>
</blockquote>
Yes, that is encouraging!<br>
<blockquote
cite="mid:AANLkTi=jwq2K0sR5yKR5xvPsoJGu3cdWA4fwWLz9va8A@mail.gmail.com"
type="cite">
<div>
<div>
<div><br>
...<br>
<div>
<div class="gmail_quote">
<div><br>
</div>
<div>No. We don't want to say that, even the way Darwin Core is right
now, because specimens don't have properties separate from Occurrences
- they ARE Occurrences.</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
Well, I'm glad to hear you say that because I was starting to think
that I was crazy when I treated PreservedSpecimens, images, sounds,
etc. AS Occurrences in the Biodiversity Informatics paper. I think I
got that notion from reading your posts on the list last (northern
hemisphere) fall. <br>
<blockquote
cite="mid:AANLkTi=jwq2K0sR5yKR5xvPsoJGu3cdWA4fwWLz9va8A@mail.gmail.com"
type="cite">
<div>
<div>
<div>
<div>
<div class="gmail_quote">
<div>
...<br>
</div>
<div><br>
</div>
<div>Yes. What this says is that Darwin Core, as it currently stands,
doesn't care to distinguish between these classes. I agree that this is
a mistake. It makes me wonder if it is EVER appropriate to subClass. <br>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
OK, how hard would it be to get rid of the subClassOf properties in the
RDF term definitions? Is that something that the TAG could do by fiat
or would it have to go through the official DwC change process?<br>
<blockquote
cite="mid:AANLkTi=jwq2K0sR5yKR5xvPsoJGu3cdWA4fwWLz9va8A@mail.gmail.com"
type="cite">
<div>
<div>
<div>
<div>
<div class="gmail_quote">
<div><br>
</div>
...<br>
<div> </div>
<blockquote class="gmail_quote"
style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div bgcolor="#ffffff" text="#000000">I guess if one objects to
this kind of overloading, one could just use
the URIs of the DwC classes themselves as values for rdf:type rather
than using the dwctype URIs. The RDF there doesn't seem to have any
kind of subclassing (but there is the problem of typing
PhysicalSpecimen which is not a generic DwC class).<br>
</div>
</blockquote>
<div><br>
</div>
<div>And a new class would have to be created every time a new token
was conceived, and new predicates for every relationship of every new
token to every other Class they might relate to. Doesn't seem scalable
to try to define the whole biodiversity universe in this way, so what
is the "sweet spot" solution. Create only what you need to create to
solve the specific problem at hand? Don't try to standardize at this
level?</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
Well, upon further reflection, I'm thinking that it isn't necessary to
create a new class for every kind of token. I have been mentally
equating Darwin Core classes with rdfs:Classes. They could be the
same thing, but wouldn't have to be. What is important here is that we
have a way to meet the (somewhat vague) requirement in the GUID
Applicability statement
(<a class="moz-txt-link-freetext" href="http://www.tdwg.org/stdtrack/article/download/150/51">http://www.tdwg.org/stdtrack/article/download/150/51</a>) that resources
in the biodiversity informatics world should be typed using a
well-known vocabulary. If there are already well-known classes or
types for the thing we need to describe (like StillImage in Dublin Core
or Person in FOAF) then we can use them. The issue comes to a head
when other vocabularies don't have terms for the things we need to
type, e.g. Individuals and PreservedSpecimens. That's where DwC needs
to create either classes or dwctype terms (unencumbered with subClass
properties!). The problem here is that there needs to be a consensus
in the community about appropriate values for rdf:type in RDF provided
for GUIDs. Can a "semantic reasoner" program know that
<a class="moz-txt-link-freetext" href="http://rs.tdwg.org/dwc/dwctype/Occurrence">http://rs.tdwg.org/dwc/dwctype/Occurrence</a> is the same kind of thing as
<a class="moz-txt-link-freetext" href="http://rs.tdwg.org/dwc/terms/Occurrence">http://rs.tdwg.org/dwc/terms/Occurrence</a> and as the TDWG ontology URI
for Occurrence (which I can't remember at the moment)? I think not
unless we assert some kind of sameAs relationship, which seems
dangerous to me. The GUID applicability statement specifically
mentions the TDWG ontology, but I think the GUID implementation is
happening too fast to wait for all of the "cat herding" that
would be needed before a TDWG ontology is complete enough to serve this
purpose. I think dwctype is the best answer for things that aren't
already typed in another vocabulary. But I'm not going to use it while
the subclass problems are still there.<br>
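<br>
A small sketch of the point about the two Occurrence URIs, in plain Python. It assumes a deliberately naive reasoner that treats two URIs as the same resource only when they are identical strings or are directly linked by an explicit owl:sameAs statement (no transitivity); the function name is made up.<br>
<pre>
```python
# Hypothetical, deliberately naive equivalence check between two URIs.
DWCTYPE_OCCURRENCE = "http://rs.tdwg.org/dwc/dwctype/Occurrence"
DWC_OCCURRENCE = "http://rs.tdwg.org/dwc/terms/Occurrence"
SAME_AS = "owl:sameAs"

def same_resource(a, b, graph):
    """True only if a and b are the identical URI or directly linked by owl:sameAs."""
    if a == b:
        return True
    return (a, SAME_AS, b) in graph or (b, SAME_AS, a) in graph

# With no equivalence asserted, the two URIs are unrelated names.
without_assertion = same_resource(DWCTYPE_OCCURRENCE, DWC_OCCURRENCE, set())

# Only an explicit assertion lets the reasoner connect them.
asserted = {(DWCTYPE_OCCURRENCE, SAME_AS, DWC_OCCURRENCE)}
with_assertion = same_resource(DWCTYPE_OCCURRENCE, DWC_OCCURRENCE, asserted)

print(without_assertion, with_assertion)
```
</pre>
The similar spelling of the two URIs carries no meaning for the program; any connection has to be stated, and whoever states it takes on every consequence of that equivalence.<br>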
<blockquote
cite="mid:AANLkTi=jwq2K0sR5yKR5xvPsoJGu3cdWA4fwWLz9va8A@mail.gmail.com"
type="cite">
<div>
<div>
<div>
<div>
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote"
style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div bgcolor="#ffffff" text="#000000">By the way, we don't have one
DwCType for every DwC (or borrowed Dublin
Core) Class as you said. There are no types for Identification, Event,
and GeologicalContext<br>
</div>
</blockquote>
<div><br>
</div>
<div>There is a type for Event (<a moz-do-not-send="true"
href="http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm#Event">http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm#Event</a>),
but not for the other two. When I said "<span class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;">We
have one DwCType for every DwC (or borrowed Dublin Core) Class" I
should have added "for which it makes sense to have a record." <br>
</span></div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
As I implied above, I think it makes sense to have a dwctype for an
instance of any DwC class for which one might reasonably create a
separate GUID (which is probably all of the classes). <br>
<br>
<pre class="moz-signature" cols="72">--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a>
</pre>
</body>
</html>