<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
I'm having problems with this in several ways, but I think the main one
at the moment is saying that PreservedSpecimen is an rdfs:subClassOf
Occurrence. If we do that, then we say that all instances of
PreservedSpecimen are also instances of Occurrence. Based on the
discussion that took place in the thread about what is an Occurrence, I
thought the consensus was that the Occurrence was a separate resource
from the token that documents it. If that is true, then it does not
make sense to say that a PreservedSpecimen is an Occurrence. Rather,
in RDF I would say:<br>
<br>
&lt;rdf:Description
rdf:about=<a class="moz-txt-link-rfc2396E" href="http://something.org/12345#occurrence">"http://something.org/12345#occurrence"</a>&gt;<br>
&lt;rdf:type
rdf:resource=<a class="moz-txt-link-rfc2396E" href="http://rs.tdwg.org/dwc/dwctype/Occurrence">"http://rs.tdwg.org/dwc/dwctype/Occurrence"</a>/&gt;<br>
... other properties of the Occurrence such as dwc:recordedBy ...<br>
&lt;/rdf:Description&gt;<br>
&lt;rdf:Description rdf:about=<a class="moz-txt-link-rfc2396E" href="http://something.org/12345#specimen">"http://something.org/12345#specimen"</a>&gt;<br>
&lt;rdf:type
rdf:resource=<a class="moz-txt-link-rfc2396E" href="http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen">"http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen"</a>/&gt;<br>
... other properties of the specimen such as dwc:preparations ...<br>
&lt;/rdf:Description&gt;<br>
<br>
If it is NOT true that an Occurrence is a separate resource from the
token, then do we want to say:<br>
<br>
&lt;rdf:Description rdf:about=<a class="moz-txt-link-rfc2396E" href="http://something.org/12345">"http://something.org/12345"</a>&gt;<br>
&lt;rdf:type
rdf:resource=<a class="moz-txt-link-rfc2396E" href="http://rs.tdwg.org/dwc/dwctype/Occurrence">"http://rs.tdwg.org/dwc/dwctype/Occurrence"</a>/&gt;<br>
&lt;rdf:type
rdf:resource=<a class="moz-txt-link-rfc2396E" href="http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen">"http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen"</a>/&gt;<br>
... other properties of the Occurrence such as dwc:recordedBy ...<br>
... other properties of the specimen such as dwc:preparations ...<br>
&lt;/rdf:Description&gt;<br>
<br>
?<br>
If we go by this approach (as you seem to suggest when you write:<br>
<font class="Apple-style-span" face="arial, sans-serif"><span
class="Apple-style-span" style="border-collapse: collapse;">Why
not? Nothing in Darwin Core says that you can't create a schema in
which an Occurrence is supported by multiple "tokens", each with its
own basisOfRecord. Of course, Simple Darwin Core doesn't support that,
and maybe that's where these perceptions of inadequacy come from. <br>
</span></font>) then that suggests to me that we could say that an
Occurrence which is documented by both a specimen and an image would be
described as:<br>
<br>
&lt;rdf:Description rdf:about=<a class="moz-txt-link-rfc2396E" href="http://something.org/12345">"http://something.org/12345"</a>&gt;<br>
&lt;rdf:type
rdf:resource=<a class="moz-txt-link-rfc2396E" href="http://rs.tdwg.org/dwc/dwctype/Occurrence">"http://rs.tdwg.org/dwc/dwctype/Occurrence"</a>/&gt;<br>
&lt;rdf:type
rdf:resource=<a class="moz-txt-link-rfc2396E" href="http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen">"http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen"</a>/&gt;<br>
&lt;rdf:type
rdf:resource=<a class="moz-txt-link-rfc2396E" href="http://purl.org/dc/dcmitype/StillImage">"http://purl.org/dc/dcmitype/StillImage"</a>/&gt;<br>
... other properties of the Occurrence such as dwc:recordedBy ...<br>
... other properties of the specimen such as dwc:preparations ...<br>
... other properties of the image such as dcterms:rights,
mrtg:caption, etc. ...<br>
&lt;/rdf:Description&gt;<br>
<br>
and you get odd statements like saying that occurrences have
copyrights, images have preparations, etc.<br>
<br>
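A minimal sketch of how those odd statements arise, written in Python rather than RDF. Triples are plain tuples and the prefixed names abbreviate the URIs used above. The "#image" fragment, the rights string, and the literal values are assumptions made up for illustration:<br>
<br>

```python
RDF_TYPE = "rdf:type"
BASE = "http://something.org/12345"

# One subject carrying all three types (the merged description above).
merged = [
    (BASE, RDF_TYPE, "dwctype:Occurrence"),
    (BASE, RDF_TYPE, "dwctype:PreservedSpecimen"),
    (BASE, RDF_TYPE, "dcmitype:StillImage"),
    (BASE, "dwc:recordedBy", "Joe Curator"),
    (BASE, "dwc:preparations", "pressed and dried"),
    (BASE, "dcterms:rights", "hypothetical rights statement"),
]

# One subject per resource. "#image" is a made-up fragment identifier.
separate = [
    (BASE + "#occurrence", RDF_TYPE, "dwctype:Occurrence"),
    (BASE + "#occurrence", "dwc:recordedBy", "Joe Curator"),
    (BASE + "#specimen", RDF_TYPE, "dwctype:PreservedSpecimen"),
    (BASE + "#specimen", "dwc:preparations", "pressed and dried"),
    (BASE + "#image", RDF_TYPE, "dcmitype:StillImage"),
    (BASE + "#image", "dcterms:rights", "hypothetical rights statement"),
]


def types_of_subjects_with(triples, predicate):
    """Return every rdf:type asserted for subjects that carry `predicate`."""
    subjects = {s for s, p, _ in triples if p == predicate}
    return {o for s, p, o in triples if p == RDF_TYPE and s in subjects}


print(sorted(types_of_subjects_with(merged, "dcterms:rights")))
print(sorted(types_of_subjects_with(separate, "dcterms:rights")))
```

With a single subject, the thing that carries dcterms:rights is also typed as an Occurrence and a PreservedSpecimen; with separate subjects, only the image is. The links between the occurrence and its tokens are left out of this sketch.<br>
<br>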
Ugh. It gets even worse. I just looked at
<a class="moz-txt-link-freetext" href="http://darwincore.googlecode.com/svn/trunk/rdf/dwctype.rdf">http://darwincore.googlecode.com/svn/trunk/rdf/dwctype.rdf</a> which I
guess is the official RDF description of the dwctypes. According to
it, not only is PreservedSpecimen a subclass of Occurrence, but
Occurrence is a subclass of <a class="moz-txt-link-freetext" href="http://purl.org/dc/dcmitype/Event">http://purl.org/dc/dcmitype/Event</a> . That
means that not only is an instance of PreservedSpecimen an instance of
Occurrence, but it is also by inference an instance of Event (even if
we don't explicitly state that using rdf:type). Thus the resource I've
represented in RDF above is all at once an event, an occurrence, a
specimen, and an image. That really mixes up entities that we seemed
to be considering separate things in that thread. I guess if this kind
of subclassing is in the RDF, that's the way it is. But I don't like
it. <br>
<br>
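The inference chain can be sketched in a few lines of Python. This is not a real RDFS reasoner: the single-parent SUBCLASS_OF mapping is an assumption that holds only the two subclass statements described above, written with abbreviated prefixes:<br>
<br>

```python
# Hypothetical, simplified class hierarchy: each class has at most one
# superclass, taken from the subclass statements described in the text.
SUBCLASS_OF = {
    "dwctype:PreservedSpecimen": "dwctype:Occurrence",
    "dwctype:Occurrence": "dcmitype:Event",
}


def inferred_types(asserted):
    """Return the asserted types plus every superclass reachable from them."""
    result = set()
    for cls in asserted:
        # Walk up the chain until it ends or reaches a class already seen.
        while cls is not None and cls not in result:
            result.add(cls)
            cls = SUBCLASS_OF.get(cls)
    return result


types = inferred_types({"dwctype:PreservedSpecimen", "dcmitype:StillImage"})
print(sorted(types))
```

Asserting only PreservedSpecimen and StillImage yields four types here, with Occurrence and Event added by inference alone.<br>
<br>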
This was all just starting to make sense to me when I had decided that
it was best to go against what I said in the Biodiversity Informatics
paper and separate Occurrences from their tokens. Maybe this
separation only matters if one is trying to clearly define what is a
property of what, as in RDF, and doesn't matter in databases where you
don't really have to be clear about the subject of properties. But it
seems to me like it's overloading the dwctypes to say that they should
both serve as the way to define the type of a thing (i.e. the class to
which the thing belongs) and to indicate the relationship that some
thing has to something else (i.e. hasASpecimen, hasAnImage,
hasAnObservation, etc.) as one is apparently doing with basisOfRecord.
I guess if one objects to this kind of overloading, one could just use
the URIs of the DwC classes themselves as values for rdf:type rather
than using the dwctype URIs. The RDF there doesn't seem to have any
kind of subclassing (but there is the problem of typing
PhysicalSpecimen which is not a generic DwC class).<br>
<br>
By the way, we don't have one DwCType for every DwC (or borrowed Dublin
Core) Class as you said. There are no types for Identification, Event,
and GeologicalContext.<br>
<br>
Thanks for the explanation - I'll keep trying to understand.<br>
<br>
Steve<br>
<br>
John Wieczorek wrote: [some pieces cut out to paste above]<br>
<blockquote
cite="mid:AANLkTinierkxWJVFgrT7tO69X4SGYydjY4Hc3jX8aZ3o@mail.gmail.com"
type="cite">Sorry for the delay. Just back from Tanzania.
<div><br>
</div>
<div>The basisOfRecord is meant to classify the content of the
resource (record) as specifically as possible, in the sense of how the
contributor of the resource intended for it to be used. The value of
the basisOfRecord should be based on the most specific Class (a
DwCtype) of information the resource (record) represents. Thus, if an
Occurrence record was supported by a PreservedSpecimen as evidence,
then the record containing the specimen information should have
PreservedSpecimen as the basisOfRecord. It's still an Occurrence,
because PreservedSpecimen is a formal refinement of an Occurrence in
the RDF sense of &lt;rdfs:subClassOf rdf:resource="<a
moz-do-not-send="true" href="http://rs.tdwg.org/dwc/dwctype/Occurrence">http://rs.tdwg.org/dwc/dwctype/Occurrence</a>"/&gt;.
So, your example record should have PreservedSpecimen as its
basisOfRecord, and that means it is an Occurrence record, because all
PreservedSpecimen records are.</div>
<div><br>
</div>
<div>Why have the Occurrence type? For consistency. We have one
DwCType for every DwC (or borrowed Dublin Core) Class. Yet, we are much
more interested in the peculiarities of Occurrences than we are in
different types of, say, Events, because of the implications for their
use. So we have gone to the effort to create several subclasses of
Occurrence based on demonstrated requirements.</div>
<div><br>
</div>
<div>Why have a basisOfRecord at all? To assist the data consumer to
know the ways in which the resource (record) might be used. Without the
basisOfRecord term, the detailed content of the records would have to
be assessed to assert what the record was about.</div>
<div><br>
</div>
<div>Steve said,</div>
<div>"<span class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;">But
realistically, any such table that includes multiple token types isn't
going to work very well."</span></div>
<div><span class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;"><br>
</span></div>
<div><span class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;">and,</span></div>
<div><span class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;"><br>
</span></div>
<div><span class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;">"</span><span
class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;">basisOfRecord
also has no use for Occurrences that have several tokens. Which of the
several tokens are we saying is the "basis" of the record?"</span></div>
<div><span class="Apple-style-span"
style="font-family: arial,sans-serif; border-collapse: collapse;"><br>
</span></div>
<div><font class="Apple-style-span" face="arial, sans-serif"><span
class="Apple-style-span" style="border-collapse: collapse;">Why not?
Nothing in Darwin Core says that you can't create a schema in which an
Occurrence is supported by multiple "tokens", each with its own
basisOfRecord. Of course, Simple Darwin Core doesn't support that, and
maybe that's where these perceptions of inadequacy come from. </span></font></div>
<div><font class="Apple-style-span" face="arial, sans-serif" size="1"><span
class="Apple-style-span" style="border-collapse: collapse;"><br>
</span></font></div>
<div><br>
<div class="gmail_quote">On Wed, Oct 27, 2010 at 10:34 AM, Steve
Baskauf <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:steve.baskauf@vanderbilt.edu" target="_blank">steve.baskauf@vanderbilt.edu</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div bgcolor="#ffffff" text="#000000">
<table border="1" cellpadding="2" cellspacing="2" width="100%">
<tbody>
<tr>
<td valign="top">occurrenceID<br>
</td>
<td valign="top">recordedBy<br>
</td>
<td valign="top">other Occurrence terms<br>
</td>
<td valign="top">preparations<br>
</td>
<td valign="top">other specimen-related terms<br>
</td>
<td valign="top">basisOfRecord<br>
</td>
</tr>
<tr>
<td valign="top"><a moz-do-not-send="true"
href="http://herbarium.org/12345" target="_blank">http://herbarium.org/12345</a><br>
</td>
<td valign="top">Joe Curator<br>
</td>
<td valign="top">...<br>
</td>
<td valign="top">pressed and dried<br>
</td>
<td valign="top">...<br>
</td>
<td valign="top">???<br>
</td>
</tr>
</tbody>
</table>
<br>
OK, above is an example of a database record that I would consider
typical. The manager has flattened the general model we have been
discussing to merge the Occurrence resource with the token resource
since in his/her database no occurrence has any token other than a
single specimen. So what is the value for basisOfRecord: Occurrence or
PreservedSpecimen? What I'm getting at here is that there seems to be
ambiguity as to whether we intend for basisOfRecord to represent the
type of the record (which in this case I would say is Occurrence) or
the type of the token on which the record is based (which in this case
is PreservedSpecimen). In that lengthy discussion that happened last
October, there was discussion of having the proposed recordClass
represent the type of the overall record (Occurrence, Event, Location,
etc.) and basisOfRecord representing the type of the token or evidence
on which an Occurrence record is based (PreservedSpecimen). I'm not
sure what is intended now. <br>
<br>
I see this lack of clarity as being a consequence of the general
blurring of the distinction between the Occurrence and the "token". I
feel like there is a general consensus that we need to tighten up our
definitions regarding Occurrences and their evidence even if some
people will prefer to continue using the "flattened" approach to
Occurrences and their tokens illustrated above (which they are entitled
to do). I don't have a problem with the various types that you've
listed below. The problem is that I don't think the definition of
basisOfRecord makes clear how it (basisOfRecord) should apply - the
definition just says "the specific nature of the data record" and that
the controlled vocabulary of the Darwin Core Type Vocabulary should be
used. Since the Type Vocabulary includes all of the classes (Event,
Location, Occurrence, etc.) in addition to the types of "tokens", I
believe that some users might think that making a statement like<br>
basisOfRecord="Event" <br>
is correct. If we intend for basisOfRecord to ONLY apply to
Occurrences, and for basisOfRecord to ONLY have as valid values the
types that apply to tokens (i.e. PreservedSpecimen, LivingSpecimen,
HumanObservation), then we should say so explicitly in the definition.
If we do not say this, then we end up with the kind of ambiguity that I
illustrated above. basisOfRecord ends up getting "overloaded" to
represent general classes of records and also to represent the types of
tokens.<br>
<br>
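The stricter reading proposed above could be sketched as a simple check. The two sets are assumptions assembled from the types named in this thread, not an official controlled vocabulary, and check_basis_of_record is a hypothetical helper:<br>
<br>

```python
# Assumption: token types are the Occurrence subtypes named in this thread.
TOKEN_TYPES = {
    "PreservedSpecimen",
    "FossilSpecimen",
    "LivingSpecimen",
    "HumanObservation",
    "MachineObservation",
}

# General record classes that the Type Vocabulary also contains.
RECORD_CLASSES = {"Event", "Location", "Occurrence", "Taxon"}


def check_basis_of_record(value):
    """Classify a basisOfRecord value under the proposed, stricter reading."""
    if value in TOKEN_TYPES:
        return "ok"
    if value in RECORD_CLASSES:
        return "record class, not a token type"
    return "unknown"


for value in ("PreservedSpecimen", "Event", "Occurrence"):
    print(value, "->", check_basis_of_record(value))
```

Under this reading, basisOfRecord="Event" would be flagged rather than silently accepted.<br>
<br>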
In addition, I'm not entirely convinced that basisOfRecord actually has
any use at all. It seems to be intended for situations such as I've
illustrated above where a database table has rows that could contain
occurrences documented with different types of tokens. Ostensibly,
basisOfRecord is needed to tell us the type of the token. But
realistically, any such table that includes multiple token types isn't
going to work very well. For example, if the table includes
Occurrences that are documented by specimens, images, and no
token/memories (i.e. HumanObservations), then it's going to have a
bunch of columns that are empty for any particular record. The
specimens rows won't have anything in the image term columns, the
images won't have anything in the specimen term columns, and the human
observations won't have anything in either the specimen or image term
columns. I think most database managers would just include the terms
that apply to all Occurrences in the Occurrence table and then use
identifiers to link separate tables for the metadata terms that are
specific to the various types of tokens. basisOfRecord also has no use
for Occurrences that have several tokens. Which of the several tokens
are we saying is the "basis" of the record?<br>
<br>
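A rough sketch of the linked-table layout described above, using Python dictionaries in place of database tables. The identifiers, the "type" and "rights" column names, and the values are hypothetical:<br>
<br>

```python
# Occurrence table: only terms that apply to every Occurrence.
occurrences = {
    "occ-1": {"recordedBy": "Joe Curator"},
    "occ-2": {"recordedBy": "Joe Curator"},
}

# One row per token, linked to its Occurrence by identifier.
# Each row holds only the terms that apply to its own token type.
tokens = [
    {"occurrenceID": "occ-1", "type": "PreservedSpecimen",
     "preparations": "pressed and dried"},
    {"occurrenceID": "occ-1", "type": "StillImage",
     "rights": "hypothetical rights statement"},
    {"occurrenceID": "occ-2", "type": "HumanObservation"},
]


def token_types_for(occurrence_id):
    """Return the types of all tokens documenting one Occurrence."""
    return sorted(t["type"] for t in tokens if t["occurrenceID"] == occurrence_id)


print(token_types_for("occ-1"))
print(token_types_for("occ-2"))
```

Here occ-1 has two tokens, so no single basisOfRecord value describes it, and no row carries empty columns for terms that do not apply.<br>
<br>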
So again, my basic question is not so much about the various types and
subtypes to which we can refer, but specifically how the basisOfRecord
term is to be used.<br>
<br>
Steve<br>
<br>
Blum, Stan wrote:
<blockquote type="cite">
<div>
<div>
<pre>Steve,
I wasn't involved in those final discussions of dwc:basisOfRecord and the
type vocabulary, but I don't see a difficulty. The Dublin Core type
vocabulary includes the following:
Collection
Dataset
Event
Image
InteractiveResource
MovingImage
PhysicalObject
Service
Software
Sound
StillImage
Text
The Darwin Core types extend that with the three additional types, and with
Occurrence being further subtyped with those different kinds of Occurrences.
Location
Taxon
Occurrence
PreservedSpecimen
FossilSpecimen
LivingSpecimen
HumanObservation
MachineObservation
NomenclaturalChecklist
Data publishers/providers should categorize their Occurrence data as one of
those subtypes (or perhaps another subtype that wasn't included in that
list). It's up to the consumer to decide which kind of Occurrence subtypes
are appropriate for a particular use.
Could you give examples of the inconsistent use of values in basisOfRecord ?
Also note, John W. is traveling at the moment. Markus might be able to
provide additional thoughts.
-Stan
On 10/25/10 10:34 PM, "Steve Baskauf" <a moz-do-not-send="true"
href="mailto:steve.baskauf@vanderbilt.edu" target="_blank">&lt;steve.baskauf@vanderbilt.edu&gt;</a> wrote:
</pre>
<blockquote type="cite">
<pre>OK, I know that this sounds like a stupid question, but I really want
somebody who was involved in the development and maintenance of the
current DwC standard to tell me how the term dwc:basisOfRecord is
supposed to be used (not what it IS - I've seen the definition at
<a moz-do-not-send="true"
href="http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord"
target="_blank">http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord</a>)? I would like for
the answer of this question to be separated from the issue of what the
Darwin Core type vocabulary
(<a moz-do-not-send="true"
href="http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm"
target="_blank">http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm</a>) is for.
I re-read the lengthy thread starting with
<a moz-do-not-send="true"
href="http://lists.tdwg.org/pipermail/tdwg-content/2009-October/000301.html"
target="_blank">http://lists.tdwg.org/pipermail/tdwg-content/2009-October/000301.html</a>
which talked a lot about basisOfRecord and its relationship to other
ways of typing things. I don't want to re-plough that ground, but
I couldn't find the post that stated what the final decision was. I
remember that there was a decision to NOT create the recordClass term
which was the subject of much discussion.
I guess my confusion at this point is with the inclusion of both
"Occurrence" and "PreservedSpecimen" in the same list. Let's say that I
have a flat database where I include metadata about the Occurrence (such
as dwc:recordedBy) and the specimen (such as dwc:preparations) in the
same line. What is the basisOfRecord for that line? I would guess that
the "basis of the record" was the specimen. But the line in the record
also represents an Occurrence. It seems like there is a lack of clarity
as to whether basisOfRecord is supposed to indicate the type of the
record (which would be an Occurrence record) or whether it's supposed to
indicate the kind of evidence on which the record is based (which would
be PreservedSpecimen). There have been various times where I've seen a
database record that includes basisOfRecord and it seems to be
inconsistently applied.
I can see how the Darwin Core type vocabulary could be useful - it
pretty much lays out useful values that one could give for rdf:type.
But basisOfRecord as a term is confusing me.
Steve
</pre>
</blockquote>
</div>
</div>
<pre><div><div>
_______________________________________________
tdwg-content mailing list
<a moz-do-not-send="true" href="mailto:tdwg-content@lists.tdwg.org"
target="_blank">tdwg-content@lists.tdwg.org</a>
<a moz-do-not-send="true"
href="http://lists.tdwg.org/mailman/listinfo/tdwg-content"
target="_blank">http://lists.tdwg.org/mailman/listinfo/tdwg-content</a></div></div>
</pre>
</blockquote>
<div><br>
<pre cols="72">--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
<a moz-do-not-send="true" href="http://bioimages.vanderbilt.edu"
target="_blank">http://bioimages.vanderbilt.edu</a>
</pre>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a>
</pre>
</body>
</html>