Markus,
Thanks! After I posted the other night I was thinking that the way I
had constructed the RDF was a bit convoluted. Your example is much more
straightforward (although, except for a lack of typing on my part, the
two examples actually result in the same triples).
As you noted, the other thing that my example has is that the record
for the offspring makes reference to the dwc:resourceRelationshipID as
a way to assert that the offspring has that relationship. I would be
interested in some discussion about the appropriate use of the various
dwc:xxxxID terms (e.g. dwc:occurrenceID, dwc:institutionID, etc.). As
was the case with the resourceRelationship terms, there aren't a lot of
examples showing the proper use of these terms, with the noteworthy
exception of http://rs.tdwg.org/dwc/terms/guides/xml/index.htm . As I
see it, there are two possible uses of these terms. One (the "first"
use) is to identify the actual resource that is the subject of a
record. The other (the "second" use) is as an "IDRef" term that
connects the subject resource to another record of a different type.
In the XML example, we see both uses. In the first example in section
2.7 we see
<dcterms:Location>
  <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
  ... more metadata...
</dcterms:Location>
<dwc:Occurrence>
  ... more metadata ...
  <dwc:occurrenceID>urn:catalog:MVZ:Mammals:14523</dwc:occurrenceID>
  ... more metadata ...
  <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
</dwc:Occurrence>
where in the dcterms:Location record, dwc:locationID indicates the
identifier for the Location itself (i.e. the subject), while in the
dwc:Occurrence record, dwc:locationID refers to the object of the
location property of the Occurrence. Because of the freewheeling
nature of generic XML files, I guess this is OK. However, when I'm
trying to think clearly about how to express relationships in RDF, I've
come to decide that I don't like this dual use. I would express the
above relationships in RDF as follows:
<rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
  <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
  ...metadata...
</rdf:Description>
<rdf:Description
    rdf:about="http://resolver.org/urn:catalog:MVZ:Mammals:14523">
  <rdf:type rdf:resource="http://rs.tdwg.org/dwc/terms/Occurrence"/>
  ... metadata...
  <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
</rdf:Description>
To me this makes the most sense. It corresponds to the "second" use in
the XML example (it indicates the Location property of the Occurrence,
i.e. serves as the predicate connecting the subject Occurrence to the
object Location). One could also do something like
<rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
  <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
  <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
  ...metadata...
etc.
or
<rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
  <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
  <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
  ...metadata...
etc.
But this seems rather pointless. In the first case, the rdf:about
attribute already identifies the subject of the relationship in a
Linked Data manner. In the second case, why have a special term to
indicate a literal version of the identifier when there is a more
generic and well-known term that is in common use? E.g.
<rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
  <dcterms:identifier>http://guid.mvz.org/sites/arg/127</dcterms:identifier>
  <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
  ...metadata...
etc.
In contrast, there is no simple alternative that I can see (with the
possible exception of the dwc:relatedResource, which I wouldn't call a
SIMPLE alternative) to using dwc:locationID as the predicate that
points to the Location object of an Occurrence subject. Darwin Core
doesn't have any terms specifically designed for this (such as
"hasLocation" or "hasIdentification" for example) but I don't see any
reason why the dwc:xxxxID terms couldn't be used this way.
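To make the IDRef reading concrete, here is a rough Python sketch that pulls the triples out of the Occurrence/Location RDF above. It uses only the standard library, and the triples() helper is something I made up for illustration: it handles just the plain rdf:Description pattern used in this message and is not a real RDF/XML parser. The "...metadata..." placeholders are left out, and I have written the type property as rdf:type.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The Location and Occurrence descriptions from this message, wrapped in
# rdf:RDF with namespace declarations so that the document is well-formed.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
    <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
  </rdf:Description>
  <rdf:Description
      rdf:about="http://resolver.org/urn:catalog:MVZ:Mammals:14523">
    <rdf:type rdf:resource="http://rs.tdwg.org/dwc/terms/Occurrence"/>
    <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (subject, predicate, object) tuples.

    Hypothetical helper: covers only rdf:Description elements whose
    children carry either an rdf:resource attribute or literal text.
    """
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local";
            # dropping the braces gives the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            obj = prop.get(f"{{{RDF}}}resource")
            if obj is None:
                obj = (prop.text or "").strip()
            yield (subject, predicate, obj)


for s, p, o in triples(DOC):
    print(s, p, o)
```

The triple of interest is the one whose subject is the Occurrence, whose predicate is dwc:locationID, and whose object is the Location URI. That is the "second" use: the term acts as the link between the two resources rather than as a label on the Location itself.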
The main issue I can see with the use of these dwc:xxxxID terms is
their placement in classes, which I guess is related to their
rdfs:domain. Although the term description comments at
http://darwincore.googlecode.com/svn/trunk/rdf/dwcterms.rdf (is this
the actual place where DwC is officially defined?) say that each term
has an rdfs:domain property, they actually don't in the RDF. So the
only real clue to the intended subjects of the dwc:xxxxID terms is the
placement of the dwc:xxxxID terms into classes. In most cases, a
dwc:xxxxID term is placed in the class of the thing that the term is
identifying (e.g. occurrenceID is in the Occurrence class,
identificationID is in the Identification class, etc.). This would
hint that the terms were intended to be used with instances of their
class as their subject. However, if the terms were used as IDRefs (the
"second" use shown in the XML example) then they really should be in
different classes. For example, dwc:locationID could be a member of
the class Occurrence, since an Occurrence could have dwc:locationID as
a property with an instance of a Location as its object. But then
there might be the question of whether an Event could also have a
dwc:locationID as a property, and dwc:locationID shouldn't be in two
classes. In the absence of defined rdfs:domains, I guess the terms can
be used with pretty much any subject a user thinks makes sense. As Bob
Morris pointed out in a recent post, there aren't very many
applications that are doing much in the way of semantic reasoning at
this point. But I suppose (hope?) that there could be some in the
future, so achieving some kind of consensus on how the terms should be
used in RDF would probably be good.
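To show why the domain question would matter once reasoners are in play, here is a small Python sketch of the standard RDFS domain rule (rule rdfs2: if p has rdfs:domain C and a triple "s p o" exists, then s is inferred to have rdf:type C). The domain declaration in it is purely HYPOTHETICAL; as noted above, dwcterms.rdf does not actually declare one. The variable and function names are mine, chosen for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
DWC = "http://rs.tdwg.org/dwc/terms/"
LOCATION = "http://purl.org/dc/terms/Location"
OCCURRENCE = DWC + "Occurrence"

occ = "http://resolver.org/urn:catalog:MVZ:Mammals:14523"
loc = "http://guid.mvz.org/sites/arg/127"

# The IDRef usage: the Occurrence points at its Location via dwc:locationID.
data = {
    (occ, RDF_TYPE, OCCURRENCE),
    (occ, DWC + "locationID", loc),
}

# HYPOTHETICAL schema statement, NOT present in dwcterms.rdf: suppose the
# class placement of locationID were turned into an rdfs:domain of Location.
hypothetical_schema = {
    (DWC + "locationID", RDFS_DOMAIN, LOCATION),
}


def infer_domain_types(data, schema):
    """Apply RDFS rule rdfs2 once and return the inferred rdf:type triples."""
    domains = {(s, o) for (s, p, o) in schema if p == RDFS_DOMAIN}
    return {
        (s, RDF_TYPE, cls)
        for (s, p, o) in data
        for (prop, cls) in domains
        if p == prop
    }


for triple in infer_domain_types(data, hypothetical_schema):
    print(triple)
```

Under that assumed domain, a reasoner would conclude that the Occurrence is itself a dcterms:Location, which is clearly not what the IDRef usage intends. With no domain declared (an empty schema set) nothing is inferred, which is the situation we are in today.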
At this point, the dwc:xxxxID terms serve a useful purpose for me in
connecting one kind of resource to another (examples in
http://bioimages.vanderbilt.edu/ind-baskauf/41870.rdf and
http://bioimages.vanderbilt.edu/baskauf/51363.rdf), so unless somebody
tells me that's "wrong" and comes up with a better term to describe the
relationships I need to describe, I'll keep using the dwc:xxxxID terms
as IDRefs.
Comments?
Steve
Markus Döring wrote:
I must say it feels strange to use the id terms in rdf to refer to rdf:resources instead of holding ID literals.
More natural to the rdf language would be <dwc:resource rdf:resource="..." /> and <dwc:relatedResource rdf:resource="..." /> I suppose, but the dwc vocabulary was built for different technologies, not rdf alone. So I guess that's the price we have to pay.
Markus
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu